BACILLUS STRAINS



This invention relates to Bacillus strains. We describe below such strains useful for the expression and 
secretion of desired polypeptides (as used herein, "polypeptide" means any useful chain of amino acids, 
including proteins). 

Bacillus strains have been used as hosts to express heterologous polypeptides from genetically
engineered vectors. The use of a Gram positive host such as Bacillus avoids some of the problems
associated with expressing heterologous genes in Gram negative organisms such as E. coli. For example,
Gram negative organisms produce endotoxins which may be difficult to separate from a desired product.
Furthermore, Gram negative organisms such as E. coli are not easily adapted for the secretion of foreign
products, and the recovery of products sequestered within the cells is time-consuming, tedious, and
potentially problematic. In addition, Bacillus strains are non-pathogenic and are capable of secreting
proteins by well-characterized mechanisms.

A general problem in using Bacillus host strains in expression systems is that they produce large
amounts of proteases which can degrade heterologous polypeptides before they can be recovered from the
culture media. The proteases which are responsible for the majority of this proteolytic activity are produced
at the end of the exponential phase of growth, under conditions of nutrient deprivation, as the cells prepare
for sporulation. The two major extracellular proteases, an alkaline serine protease (subtilisin), the product of
the apr gene, and a neutral metalloprotease, the product of the npr gene, are secreted into the medium,
whereas the major intracellular serine protease, ISP-1, is produced within the cells. Other investigators have
created genetically altered Bacillus strains that produce below-normal levels of one or more of these three
proteases, but these strains still produce high enough levels of protease to cause the degradation of
heterologous gene products prior to purification.

Stahl et al. (J. Bact., 1984, 158:411) disclose a Bacillus protease mutant in which the chromosomal
subtilisin structural gene was replaced with an in vitro derived deletion mutation. Strains carrying this
mutation produced only 10% of the wild-type extracellular serine protease activity. Yang et al. (J. Bact.,
1984, 160:15) disclose a Bacillus protease mutant in which the chromosomal neutral protease gene was
replaced with a gene having an in vitro derived deletion mutation. Fahnestock et al. (WO 86/01825) describe
Bacillus strains lacking subtilisin activity which were constructed by replacing the native chromosomal gene
sequence with a partially homologous DNA sequence having an inactivating segment inserted into it.
Kawamura et al. (J. Bact., 1984, 160:442) disclose Bacillus strains carrying lesions in the npr and apr genes
and expressing less than 4% of the wild-type level of extracellular protease activity. Koide et al. (J. Bact.,
1986, 167:110) disclose the cloning and sequencing of the isp-1 gene and the construction of an ISP-1
negative mutant by chromosomal integration of an artificially deleted gene.

Genetically altered strains which are deleted for the extracellular protease genes (apr and npr) produce
significantly lower levels of protease activity than do wild-type Bacillus strains. These bacteria, when grown
on medium containing a protease substrate, exhibit little or no proteolytic activity, as measured by the lack
of appearance of a zone of clearing (halo) around the colonies. Some heterologous polypeptides and
proteins produced from these double mutants are, nevertheless, substantially degraded prior to purification,
although they are more stable than when produced in a wild-type strain of Bacillus.

The invention provides improved Bacillus cells containing mutations in one or more of three previously
uncharacterized protease genes; the cells also preferably contain mutations in the apr and npr genes that
encode the major extracellular proteases, resulting in the inhibition by the cells of production of these
extracellular proteases. The mutations of the invention include a mutation in the epr gene which inhibits the
production by the cell of the proteolytically active epr gene product, and/or a mutation in the gene (herein,
the "RP-I" gene) encoding residual protease I (RP-I) which inhibits the production by the cell of
proteolytically active RP-I, and/or a mutation in the gene (herein, the "RP-II" gene) encoding residual
protease II (RP-II). The proteases encoded by the epr gene and RP-II genes are novel proteins. Most
preferably, the mutations are deletions within the coding region of the genes, including deletion of the entire
coding region; alternatively, a mutation can consist of a substitution of one or more base pairs for naturally
occurring base pairs, or an insertion within the protease coding region.

Bacillus cells in accordance with the invention may additionally contain a mutation in the isp-1 gene
encoding intracellular serine protease I and may in addition contain a mutation which blocks sporulation and
thus reduces the cell's capacity to produce sporulation-dependent proteases; preferably, this mutation
blocks sporulation at an early stage but does not eliminate the cell's ability to be transformed by purified
DNA; most preferably, this mutation is the spo0A mutation (described below).

The invention provides, in an alternative aspect thereof, a method for producing stable heterologous
polypeptides in a Bacillus host cell by modifying the host to contain mutations in the apr and npr genes and
in one or more of the genes including the epr gene, the RP-I gene, and the RP-II gene.

The invention also features, in respective further aspects thereof, purified DNA, expression vectors
containing DNA, and host Bacillus cells transformed with DNA, in each case encoding one of the proteases
RP-I, RP-II, or the product of the epr gene; preferably, such DNA is derived from Bacillus subtilis.

The invention also features, in yet further aspects thereof, the isolation of substantially pure Epr,
residual protease I (RP-I), and another previously uncharacterised protease called residual protease II
(RP-II), and characterisation of the RP-I and RP-II proteases; as used herein, "substantially pure" means
greater than 90% pure by weight.

The terms "epr gene", "RP-I gene", and "RP-II gene" herein mean the respective genes corresponding
to these designations in Bacillus subtilis, and the evolutionary homologues of those genes in other Bacillus
species, which homologues, as is the case for other Bacillus proteins, can be expected to vary in minor
respects from species to species. The RP-I and RP-II genes of B. subtilis are also designated, respectively,
the bpr and mpr genes. In many cases, sequence homology between evolutionary homologues is great
enough so that a gene derived from one species can be used as a hybridization probe to obtain the
evolutionary homologue from another species, using standard techniques. In addition, of course, those
terms also include genes in which base changes have been made which, because of the redundancy of the
genetic code, do not change the encoded amino acid residue.

Using the procedures described herein, we have produced Bacillus strains which are significantly
reduced in their ability to produce proteases, and are therefore useful as hosts for the expression, without
significant degradation, of heterologous polypeptides capable of being secreted into the culture medium.
We have found that our Bacillus cells, even though containing several mutations in genes encoding related
activities, are not only viable, but healthy.

Any desired polypeptide can be expressed using our techniques, e.g., medically useful proteins such as
hormones, vaccines, antiviral proteins, antitumor proteins, antibodies or clotting proteins; and agriculturally
and industrially useful proteins such as enzymes or pesticides, and any other polypeptide that is unstable in
Bacillus hosts that contain one or more of the proteases inhibited in our cells.

Other features and advantages of the invention will be apparent from the following description of 
preferred embodiments thereof. 
The drawings will first be briefly described.

Fig. 1 is a series of diagrammatic representations of the plasmids p371 and p371A, which contain a
2.4 kb HindIII insert encoding the Bacillus subtilis neutral protease gene and the same insert with a deletion
in the neutral protease gene, respectively, and p371ACm, which contains the Bacillus cat gene.

Fig. 2 is a Southern blot of HindIII digested IS75 and IS75NA DNA probed with a 32P-labeled
oligonucleotide corresponding to part of the nucleotide sequence of the npr gene.

Fig. 3 is a representation of the 6.5 kb insert of plasmid pAS007, which encodes the Bacillus subtilis
subtilisin gene, and the construction of the deletion plasmid pAS13.

Fig. 4 is a representation of the plasmid pISP-1 containing a 2.7 kb BamHI insert which encodes the
intracellular serine protease ISP-1, and the construction of the ISP-1 deletion plasmid pAL6.

Fig. 5 is a diagrammatic representation of the cloned epr gene, showing restriction enzyme
recognition sites.

Fig. 6 is the DNA sequence of the epr gene.

Fig. 7 is a diagrammatic representation of the construction of the plasmid pNP9, which contains the
deleted epr gene and the Bacillus cat gene.

Fig. 8 is the amino acid sequence of the first 28 residues of RP-I and the corresponding DNA
sequence of the probe used to clone the RP-I gene.

Fig. 9 is a restriction map of the 6.5 kb insert of plasmid pCR83, which encodes the RP-I protein.

Fig. 10 is the DNA sequence of DNA encoding RP-I protease.

Fig. 11 is the amino acid sequence of three internal RP-II fragments (a, b, c), and the nucleotide
sequence of three guess-mers used to clone the gene (a), (b) and (c).

Fig. 12 is a Southern blot of GP241 chromosomal DNA probed with BRT90 and 707.

Fig. 13 is a diagram of (a) a restriction map of the 3.6 kb PstI insert of pLPI, (b) the construction of
the deleted RP-II gene and (c) the plasmid used to create an RP-II deletion in the Bacillus chromosome.

Fig. 14 is the DNA sequence of DNA encoding RP-II.


General Strategy for Creating Protease Deleted Bacillus Strains

The general strategy we followed for creating a Bacillus strain which is substantially devoid of
proteolytic activity is outlined below.

A deletion mutant of the two known major extracellular protease genes, apr and npr, was constructed
first. The isp-1 gene encoding the major intracellular protease was then deleted to create a triple protease
deletion mutant. The spo0A mutation was introduced into either the double or triple deletion mutants to
significantly reduce any sporulation dependent protease activity present in the cell. A gene encoding a
previously unknown protease was then isolated and its entire nucleotide sequence was determined. The
gene, epr, encodes a primary product of 645 amino acids that is partially homologous to both subtilisin
(Apr) and the major internal serine protease (ISP-1) of B. subtilis. A deletion of this gene was created in vitro
and introduced into the triple protease deleted host. A deletion in a newly identified gene encoding residual
protease RP-I was then introduced to create a strain of B. subtilis having substantially reduced protease
activity and expressing only the RP-II activity. RP-II has been purified and a portion of the amino acid
sequence was determined for use in creating the nucleic acid probes which were used to clone the gene
encoding this protease. Upon cloning the gene, it was possible to create a Bacillus strain which contains a
deletion in the RP-II gene and is thus incapable of producing RP-II.

Detailed procedures for construction of the protease gene deletions and preparation of Bacillus strains 
exhibiting reduced protease activity are described below. 



General Methods

Our methods for the construction of a multiply deleted Bacillus strain are described below. Isolation of
B. subtilis chromosomal DNA was as described by Dubnau et al. (1971, J. Mol. Biol., 56: 209). B. subtilis
strains were grown on tryptose blood agar base (Difco Laboratories) or minimal glucose medium and were
made competent by the procedure of Anagnostopoulos et al. (J. Bact., 1961, 81: 741). E. coli JM107 was
grown and made competent by the procedure of Hanahan (J. Mol. Biol., 1983, 166: 587). Plasmid DNA from
B. subtilis and E. coli was prepared by the lysis method of Birnboim et al. (Nucl. Acid. Res., 1979, 7:
1513). Plasmid DNA transformation in B. subtilis was performed as described by Gryczan et al. (J. Bact.,
1978, 134: 138).



Protease assays 

Two different protease substrates, azocoll and casein (labelled either with 14C or the chromophore
resorufin), were used for protease assays, with the casein substrate being more sensitive to proteolytic
activity. Culture supernatant samples were assayed either 2 or 20 hours into stationary phase. Azocoll-based
protease assays were performed by adding 100 µl of culture supernatant to 900 µl of 50 mM Tris, pH
8, 5 mM CaCl2, and 10 mg of azocoll (Sigma), a covalently modified, insoluble form of the protein collagen
which releases a soluble chromophore when proteolytically cleaved. The solutions were incubated at 37°C
for 30 minutes with constant shaking. The reactions were then centrifuged to remove the insoluble azocoll
and the A520 of the solution determined. Inhibitors were pre-incubated with the reaction mix for 5 minutes at
37°C. Where a very small amount of residual protease activity was to be measured, 14C-casein or
resorufin-labelled casein was used as the substrate. In the 14C-casein test, culture supernatant (100 µl) was
added to 100 µl of 50 mM Tris, 5 mM CaCl2 containing 1 x 10^5 cpm of 14C-casein (New England Nuclear).
The solutions were incubated at 37°C for 30 minutes. The reactions were then placed on ice and 20 µg of
BSA were added as carrier protein. Cold 10% TCA (600 µl) was added and the mix was kept on ice for 10
minutes. The solutions were centrifuged to spin out the precipitated protein and the supernatants counted in
a scintillation counter. The resorufin-labelled casein assay involved incubation of culture supernatant with an
equal volume of resorufin-labelled casein in Tris-Cl buffer, pH 8.0, at 37°C for various times. Following
incubation, unhydrolyzed substrate was precipitated with TCA and the resulting chromogenic supernatant
was quantitated spectrophotometrically.
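
Protease levels are reported throughout this description as a percentage of the wild-type level. A minimal
Python sketch of that normalization is given here for illustration only; the function name, the optional
blank correction, and the example readings are assumptions of the sketch and are not taken from the
experiments described.

```python
def relative_activity(sample, wild_type, blank=0.0):
    """Return a sample's protease activity as a percentage of wild type.

    `sample` and `wild_type` are raw assay readings (for example A520 in the
    azocoll assay, or TCA-soluble cpm in the 14C-casein assay). `blank` is an
    optional no-enzyme background reading; subtracting it is an assumption of
    this sketch, not a step stated in the text.
    """
    corrected_wild_type = wild_type - blank
    if corrected_wild_type <= 0:
        raise ValueError("wild-type reading must exceed the blank")
    return 100.0 * (sample - blank) / corrected_wild_type


# Hypothetical readings, chosen only to illustrate the arithmetic.
print(round(relative_activity(sample=0.06, wild_type=1.02, blank=0.02), 1))
```

The same function applies to either substrate, since only the ratio of background-corrected readings is used.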

Deletion of the npr gene 



According to Yang et al. (J. Bact., 1984, 160: 15), the npr gene is contained within overlapping EcoRI
and HindIII restriction fragments of B. subtilis DNA, and a majority of the gene sequence is located on the
2.4 kb HindIII fragment. This fragment was chosen for creation of the npr deletion.



An individual clone containing the 2.4 kb HindIII fragment was isolated from a clone bank of genomic
HindIII fragments prepared as follows. Chromosomal DNA was isolated from B. subtilis strain IS75, digested
with HindIII and size fractionated by electrophoresis on a 0.8% agarose gel. DNA in the 2-4 kb size range
was electroeluted from the gel. The purified DNA was ligated to HindIII digested and alkaline phosphatase
treated pUC9 DNA (an E. coli replicon commercially available from Bethesda Research Labs, Rockville,
Md), transformed into competent cells of E. coli strain JM107, and plated on LB + 50 µg/ml ampicillin,
resulting in 1000 Amp^r colonies.

Colonies containing the cloned neutral protease gene fragment were identified by standard colony
hybridization methods (Maniatis et al., 1983, "Molecular Cloning. A Laboratory Manual", Cold Spring
Harbor, New York). Briefly, transformants are transferred to nitrocellulose filters, lysed to release the nucleic
acids and probed with an npr specific probe. A 20 base oligonucleotide complementary to the npr gene
sequence between nucleotides 520 and 540 (Yang et al., supra) was used as the probe. The sequence is
5' GGCACGCTTGTCTCAAGCAC 3'. A representative clone containing the 2.4 kb HindIII insert was identified
and named p371 (Fig. 1).
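
Because the probe is stated to be complementary to the npr gene sequence, the strand it anneals to can be
derived as its reverse complement. The short Python sketch below illustrates this and checks the stated
probe length; the helper name and constant names are assumptions made for illustration.

```python
# Watson-Crick pairing for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# Probe sequence as quoted in the text, written 5' to 3'.
NPR_PROBE = "GGCACGCTTGTCTCAAGCAC"

# Sequence (5' to 3') of the strand to which the probe would anneal.
target = reverse_complement(NPR_PROBE)
print(len(NPR_PROBE), target)
```

Applying the function twice returns the original probe, which is a convenient sanity check when transcribing oligonucleotide sequences.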

A deleted form of the npr gene in p371 was derived in vitro. A 580 bp internal RsaI fragment was
deleted by digesting p371 DNA with RsaI and HindIII. The 600 bp HindIII-RsaI fragment spanning the 5' end
of the gene and the 1220 bp RsaI-HindIII fragment spanning the 3' end of the gene (see Fig. 1) were
isolated and cloned into HindIII and alkaline phosphatase treated pUC9. This resulted in the deletion of the
center portion of the npr gene. The ligated DNA was transformed into E. coli JM107. A clone having the
desired deletion within the npr gene was identified by restriction enzyme analysis. This plasmid is
designated p371A.

A gene encoding a selectable marker was included on the vector to facilitate the selection of integrants
in Bacillus. The Bacillus cat gene, encoding resistance to chloramphenicol (Cm^r), was isolated from plasmid
pMI1101 (Youngman et al., 1984, Plasmid 12:1-9) on a 1.3 kb SalI fragment and cloned into the SalI site of
p371A. This DNA was transformed into E. coli JM107 and transformants were screened for chloramphenicol
resistance. A representative plasmid containing both the deleted npr gene and the cat gene was named
p371ACm (Fig. 1).

The vector p371ACm was derived from the E. coli replicon pUC9 and is therefore unable to replicate in
a Bacillus host. The wild-type npr gene in the chromosome of the recipient host was exchanged for the
deleted npr gene contained on the vector by reciprocal recombination between homologous sequences.
The Cm^r marker gene enabled the selection of cells into which the vector, inclusive of the protease gene
sequence, had integrated.

Vector sequences that integrated with the deleted npr gene were spontaneously resolved from the
chromosome at a low frequency, taking a copy of the npr gene along with them. Retention of the deleted
protease gene in the chromosome was then confirmed by assaying for the lack of protease activity in the
Cm^s segregants.

Specifically, competent B. subtilis IS75 cells were transformed with p371ACm and selected for Cm^r.
Approximately 2000 chloramphenicol-resistant colonies, which had presumably integrated the deleted npr
gene adjacent to, or in place of, the wild type gene, were selected. Approximately 25% of
the colonies formed smaller zones of clearing on starch agar, indicating that the wild-type gene had been
replaced with the deleted form of the gene. No neutral protease activity was detected in supernatants from
these cell cultures. In contrast, high levels of neutral protease activity were assayed in culture fluids from
wild type IS75 cells. Segregants which contained a single integrated copy of the deleted protease gene,
but which had eliminated the vector sequences, were then selected as follows.

A culture of Cm^r colonies was grown overnight in liquid media without selection, then plated onto TBAB
media. These colonies were then replicated onto media containing chloramphenicol and those that did not
grow in the presence of chloramphenicol were identified and selected from the original plate. One such Npr
negative colony was selected and designated IS75NA.

Deletion within the npr gene in IS75NA was confirmed by standard Southern blot analysis (Southern,
1977, J. Mol. Biol. 98: 503) of HindIII digested DNA isolated from B. subtilis IS75N and IS75NA probed with
the 32P-labelled npr specific oligonucleotide. The probe hybridized with a 2.4 kb HindIII fragment in wild-type
IS75N DNA and with a 1.8 kb fragment in IS75NA DNA, indicating that 600 bp of the npr gene were
deleted in IS75NA (see Fig. 2).
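
The fragment sizes quoted for the npr deletion can be cross-checked with simple arithmetic: removing the
580 bp internal RsaI fragment from the 2.4 kb HindIII insert should leave the two flanking fragments, which
together correspond to the roughly 1.8 kb band seen on the Southern blot. The Python sketch below is only
an illustrative consistency check; the constant and function names are assumptions.

```python
# Sizes quoted in the text, in base pairs.
INSERT_BP = 2400          # 2.4 kb HindIII insert in p371
RSAI_INTERNAL_BP = 580    # internal RsaI fragment that was removed
FIVE_PRIME_BP = 600       # HindIII-RsaI fragment spanning the 5' end
THREE_PRIME_BP = 1220     # RsaI-HindIII fragment spanning the 3' end


def deleted_insert_size(insert_bp, removed_bp):
    """Size of the insert that remains after an internal fragment is removed."""
    return insert_bp - removed_bp


expected = deleted_insert_size(INSERT_BP, RSAI_INTERNAL_BP)

# The two retained fragments should account for the whole remaining insert.
assert expected == FIVE_PRIME_BP + THREE_PRIME_BP
print(expected, "bp, i.e. about", round(expected / 1000, 1), "kb")
```

At the resolution of an agarose gel, the 580 bp deletion is consistent with the approximately 600 bp size difference reported between the two hybridizing bands.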


Deletion of the apr gene 

To clone the subtilisin gene (apr), a genomic library from IS75 DNA was first prepared. Chromosomal
DNA was isolated and digested with EcoRI and separated by electrophoresis through a 0.8% agarose gel.
Fragments in the 5-8 kb size range were purified by electroelution from the gel. The fragments were ligated
with EcoRI digested pBR328 DNA (publicly available from New England BioLabs) and transformed into
competent E. coli JM107 cells. Transformants were screened for plasmids containing apr gene inserts by
hybridizing with a synthetic 32P-labelled 17-mer oligonucleotide probe which was complementary to the apr
gene sequence between nucleotides 503 and 520 (Stahl et al., 1984, J. Bact. 158: 411). A clone with a 6.5
kb EcoRI insert that hybridized with the probe was selected and named pAS007 (Fig. 3). The 6.5 kb
fragment contained the entire coding sequence of the subtilisin gene.

A mutant of the apr gene was created by deleting the two internal HpaI fragments (Fig. 3). pAS007 was
first digested with HpaI and then recircularized by ligating in a dilute solution (5 µg/ml) to eliminate the two
HpaI fragments. Approximately 200 Amp^r colonies arose following transformation of JM107 cells. One of
these transformants contained a 4.8 kb EcoRI insert with one internal HpaI site. It was designated pAS12.
The deletion in the apr gene extended 500 bp beyond the 3' end of the gene; however, this DNA apparently
did not contain any genes that were essential to B. subtilis.

A 1.3 kb SalI fragment containing the Bacillus cat gene was cloned into the SalI site of pAS12
(described above) for selection of integrants in the Bacillus host chromosome. The plasmid DNA was
transformed into E. coli JM107 and plated on media containing ampicillin; approximately 50 Amp^r colonies
were recovered and replica plated onto media containing 7.5 µg/ml chloramphenicol. Three of the 50
colonies were Cm^r. Plasmid DNA was isolated from these three clones and analyzed by restriction
digestion. One of the plasmids had the desired restriction pattern and was named pAS13 (Fig. 3).

To promote integration of the deleted protease gene into the B. subtilis chromosome, pAS13 was
introduced into strain IS75NA and selected for Cm^r transformants. The transformants were then screened
for replacement of the wild-type apr gene with the deleted gene by plating on TBAB plates containing 5
µg/ml Cm and 1.5% casein. Several of the colonies which did not produce halos were selected for loss of
the Cm^r gene as described above. A representative transformant was chosen and designated GP199.

Protease activity was assayed in the culture fluids from the double protease deleted strain, as well as in
the strain having only the deleted neutral protease gene. Protease activity in Npr- Apr- mutant cells was
approximately 4-7% of wild type levels, whereas the Npr- mutant exhibited higher levels of protease
activity.


amyE Mutation 

Protease deficient strains were tested in connection with the production of a Bacillus amylase. To assay
the levels of amylase produced by various plasmid constructs it was necessary to introduce a mutant
amylase gene into the host in place of the wild type gene. This step is not essential to the present invention
and does not affect the level of protease activity; it was performed only because plasmid encoded amylase
levels could not be determined in the presence of the chromosomally encoded amylase. The amyE allele
was transformed from B. subtilis strain JF206 (trpC2, amyE) into GP199 by a transformation/selection
process known as congression. This process relies on the ability of competent B. subtilis cells to be
transformed by more than one piece of chromosomal DNA when the transforming DNA is provided in
excess. The process involves initial selection of competent cells in the population by assaying for
expression of a selectable marker gene, which subsequently facilitates screening for co-transfer of an
unselectable marker, such as inability to produce amylase.

Total chromosomal DNA was isolated from JF206 or a similar strain containing an amyE mutation.
Saturating concentrations (~1 µg) were transformed into competent GP199 (met-, leu-, his-) and His+
transformants were selected on minimal media supplemented with methionine and leucine. The transformants
were screened for an amylase minus phenotype on plates having a layer of top agar containing
starch-azure. Five percent of the His+ colonies were unable to produce halos, indicating that the amylase
gene was defective. One such transformant was assayed for the protease-deficient phenotype and was
designated GP200.
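
Since only a fraction of the selected transformants carries the unselected marker (five percent in the
experiment above), it can be useful to estimate how many selected colonies should be screened. The Python
sketch below gives one such estimate; it assumes that colonies are independent and that the co-transfer
fraction is constant, and the function name and the 95% confidence default are illustrative assumptions
rather than part of the procedure described.

```python
import math


def colonies_to_screen(cotransfer_fraction, confidence=0.95):
    """Colonies to screen to find at least one co-transformant.

    Solves 1 - (1 - f)**n >= confidence for the smallest whole number n,
    where f is the fraction of selected colonies that also carry the
    unselected marker. Independence between colonies is assumed.
    """
    if not 0 < cotransfer_fraction < 1:
        raise ValueError("cotransfer_fraction must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - cotransfer_fraction))


# Five percent co-transfer, as reported for the amyE allele.
print(colonies_to_screen(0.05))
```

Lower co-transfer fractions increase the number of colonies required roughly in inverse proportion.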

Supernatant samples from cultures of the double protease mutant were assayed for protease activity
using azocoll as the substrate. When assayed on this substrate, protease activity in the double protease
mutant strain was 4% of wild type levels. When the more sensitive substrate 14C-casein was used in the
protease assay, the double mutant displayed 5-7% of the wild type B. subtilis activity. Although protease
activity in this strain was low, we discovered that certain heterologous gene products produced by these
protease deficient cells were not stable, indicating the presence of residual protease activity. We then
sought to identify and mutate the gene(s) responsible for the residual protease activity.

In order to characterize the residual protease activity, a number of known protease inhibitors were
tested for their ability to reduce protease levels in cultures of the double protease mutant strain. PMSF
(phenylmethylsulfonyl fluoride), a known inhibitor of serine protease activity, was found to be the most
effective. The addition of PMSF to growing cultures of Apr- Npr- Bacillus cells successfully increased the
stability of heterologous peptides and proteins synthesized in and secreted from these cells. These results
indicated that at least a portion of the residual degradative activity was due to a serine protease.

Subtilisin is the major serine protease to be secreted by B. subtilis; however, the serine protease
encoded by the isp-1 gene (ISP-1) has been shown to accumulate intracellularly during sporulation
(Srivastava et al., 1981, Arch. Microbiol., 129: 227). In order to find out if the residual protease activity was
due to ISP-1, a deleted version of the isp-1 gene was created in vitro and incorporated into the
double-protease deleted strain.

Deletion of the isp-1 gene 


The isp-1 gene is contained within a 2.7 kb BamHI fragment of B. subtilis chromosomal DNA (Koide et
al., 1986, J. Bact., 167:110). Purified DNA was digested with BamHI and fragments in the 2.7 kb size range
were electroeluted from an agarose gel, ligated into BamHI digested pBR328 and transformed into E. coli
JM107 cells. One Amp^r colony that produced a halo on LB media containing 1% casein was selected and
named pISP-1. Restriction analysis of the DNA indicated that pISP-1 carried a 2.7 kb BamHI insert which
hybridized with a synthetic 25 base 32P-labeled oligonucleotide probe
[5' ATGAATGGTGAAATCCGCTTGATCC 3'] complementary to the isp-1 gene sequence (Koide et al., supra). The
restriction pattern generated by SalI and EcoRI digestions confirmed the presence of the isp-1 gene in pISP-1.

A deletion was created within the isp-1 gene by taking advantage of a unique SalI site located in the
center of the gene. Because there was an additional SalI site in the vector, the 2.7 kb BamHI gene insert
was first cloned into the BamHI site of a derivative of pBR322 (pAL4) from which the SalI site had been
eliminated (Fig. 4). The resulting plasmid, pAL5, therefore had a unique SalI site within the isp-1 gene. pAL5
DNA was digested with SalI, treated with Bal31 exonuclease for five minutes at 37°C to delete a portion of
the gene sequence, and religated. The DNA was transformed into JM107 and resulting Amp^r colonies were
screened for a BamHI insert of reduced size. A plasmid with a 1.2 kb deletion within the BamHI insert was
selected and named pAL6 (Fig. 4).

The cat gene was purified from the E. coli plasmid pMI1101 on a SalI fragment as above and cloned
into pAL6 at the EcoRV site. The resulting DNA was transformed into the double protease mutant strain
(GP200) and integrants containing the deleted ISP-1 gene were selected as described above. The
triple-protease deleted strain is called GP208 (aprA, nprA, isp-1A). Using a casein substrate, protease activity
was measured in the triple-mutant strain (Apr-, Npr-, Isp-1-) and found to be 4% of the wild type level,
about the same as the double mutant strain.

The remaining 4% residual protease activity was apparently due either to a previously described
esterase called bacillopeptidase F (Roitsch et al., 1983, J. Bact., 155: 145), or to previously unknown and
unidentified protease gene(s).



Introduction of a sporulation mutation

Because it had been shown that the production of certain proteases was associated with the process of
sporulation in B. subtilis, we reasoned that it may be useful to include a mutation which blocked sporulation
in our protease deficient hosts and thus further reduce sporulation-dependent protease production in these
strains. Mutations that block the sporulation process at stage 0 reduce the level of protease produced, but
do not eliminate the ability of the cells to be transformed by purified DNA. spo0A mutations have been
shown to be particularly efficient at decreasing protease synthesis (Ferrari et al., 1986, J. Bact. 166:173).

We first introduced the spo0A mutation into the double protease deficient strain as one aspect of our
strategy to eliminate the production of the serine protease ISP-1. We ultimately introduced the spo0A
mutation into the triple- and quadruple-protease deficient strains. This feature is useful only when a
promoter, contained within an expression vector for the production of heterologous gene products in a
Bacillus host, is not a sporulation-specific promoter (e.g. the spoVG promoter).

Saturating amounts of chromosomal DNA were prepared from B. subtilis strain JH646 (spo0A, Prot+,
Amy+, Met+) or similar strains having a spo0A mutation, and transformed into competent GP200 cells
(Spo+, Prot-, Amy-, Met-). Met+ transformants were selected by growth on minimal media plates.
Resulting transformants were then screened for co-transformation of the spo0A allele by assaying on
sporulation medium (Difco) for the sporulation deficiency phenotype, characterised by smooth colony
morphology and the lack of production of a brown pigment. Approximately 9% of the Met+ transformants
appeared to be co-transformed with the spo0A allele; a number of these were rescreened on plates
containing either starch-azure or casein to confirm that the recipients had not also been co-transformed with
intact amylase or protease genes from the donor DNA. One transformant that did not exhibit detectable
protease activity was designated GP205 (spo0A, amyE, aprA, nprE). Protease levels produced by this host
were 0.1% of the level found in the extracellular fluid of the Spo+ host, when casein was the substrate.

In the same manner, the spo0A mutation was introduced into the triple protease deficient mutant
GP208 (aprA, nprA, isp-1A) and the quadruple protease deficient mutant GP216 (aprA, nprA, isp-1A, eprA,
described below). The resulting Spo- strains are GP210 and GP235, respectively. These strains are
useful when the expression vector is not based on a sporulation dependent promoter.



Identification of a new protease gene

We expected that the isolation and cloning of the gene(s) responsible for the remaining protease activity
would be difficult using conventional methods because cells did not produce large enough amounts of the
enzyme(s) to detect by the appearance of halos on casein plates. We reasoned that it should be possible to
isolate the gene(s) if it were replicated on a high-copy vector so that the copy number of the gene(s), and
thus protease production, would be amplified to detectable levels. This strategy enabled us to isolate a
novel protease gene from a Bacillus gene bank. The first of these new protease genes has been named epr
(extracellular protease). Deletion mutants of this new gene were derived in vitro and introduced into the
Apr- Npr- Isp- Bacillus host strains by gene replacement methods as described above.



Cloning the epr gene 

In order to obtain a clone carrying a gene responsible for residual protease activity, a Sau3A library of
B. subtilis GP208 DNA was prepared. Chromosomal DNA was isolated, subjected to partial digestion with
Sau3A and size-fractionated on an agarose gel. Fragments in the 3-7 kb size range were eluted from the
gel and cloned into the BglII site of pEc224, a shuttle vector capable of replicating in both E. coli and
Bacillus (derived by ligating the large EcoRI-PvuII fragment of pBR322 with the large EcoRI-PvuII fragment
of pBD64 (Gryczan et al., 1978, PNAS 75:1428)). The ligated DNA was transformed into E. coli JM107 and
plated on media containing casein. None of the 1200 E. coli colonies produced halos on casein plates;
however, restriction analysis of the purified plasmid DNA showed that approximately 90% of the clones
contained inserts with an average size of about 4 kb. The clones were transformed into a Bacillus host to
screen for protease activity as follows. E. coli transformants were pooled in twelve groups of 100 colonies
each (G1-G12). The pooled colonies were grown in liquid media (LB + 50 µg/ml ampicillin), plasmid DNA was
isolated, transformed into B. subtilis GP208 (aprΔ, nprΔ, isp-1Δ) and plated on casein plates. Halos were
observed around approximately 5% of transformants from pool G11. Plasmid DNA was isolated from each
of the positive colonies and mapped by restriction enzyme digestion. All of the transformants contained an
identical insert of approximately 4 kb (Fig. 5). One of these plasmids was selected and named pNP1.
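
As a rough check on how much of the chromosome a library of this size samples, the Clarke-Carbon estimate can be sketched as follows. The genome size of about 4,200 kb for B. subtilis is an assumption that is not stated in this text, the function name is illustrative, and the figure of 1080 insert-bearing clones is simply 90% of the 1200 colonies reported above.

```python
def library_coverage(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Probability that a given sequence is present in at least one clone
    (Clarke-Carbon estimate, assuming random and independent inserts)."""
    fraction_per_clone = insert_kb / genome_kb
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones

# 1200 colonies, about 90% with inserts averaging 4 kb (from the text).
# The genome size is an assumed round figure, not taken from the text.
clones_with_insert = round(1200 * 0.90)
coverage = library_coverage(clones_with_insert, insert_kb=4.0, genome_kb=4200.0)
print(f"{clones_with_insert} clones, coverage = {coverage:.2f}")
```

Under these assumptions a single gene has roughly a two-in-three chance of being represented, so recovering epr from only one of the twelve pools is not surprising.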


Characterization of epr protease activity 

The residual protease activity remaining in GP208 (aprΔ, nprΔ, isp-1Δ) cultures accounted for only a
small percentage of the total protease activity produced by the host. In order to characterize the type of
protease encoded by the epr gene, the effect of different inhibitors on the protease secreted by B. subtilis
GP208/pNP1 was examined.

Culture media was obtained two hours into stationary phase and assayed using 14C-casein as the
substrate. The level of protease activity present in GP208 was not high enough to detect in the standard
protease assay described above; however, appreciable protease activity was detected in the culture
medium of GP208/pNP1, carrying the amplified epr gene. The epr protease activity was inhibited in the
presence of both 10 mM EDTA and 1 mM PMSF, suggesting that the gene encodes a serine protease which
requires the presence of a cation for activity. (Isp-1, another serine protease, is also inhibited by EDTA and
PMSF.)
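
The inference drawn from the two inhibitors can be written out as a small decision rule. This is only an illustrative sketch: the function name and the wording of the returned classes are hypothetical, and the rule encodes nothing beyond the reasoning used in this description (PMSF sensitivity indicates a serine protease, EDTA sensitivity indicates a cation requirement).

```python
def classify_protease(pmsf_inhibited: bool, edta_inhibited: bool) -> str:
    """Map a two-inhibitor profile to a tentative protease class."""
    if pmsf_inhibited and edta_inhibited:
        return "serine protease requiring a cation"
    if pmsf_inhibited:
        return "serine protease"
    if edta_inhibited:
        return "cation-dependent protease (not serine type)"
    return "unclassified by these inhibitors"

# Profile reported in the text for the epr activity (10 mM EDTA, 1 mM PMSF).
print(classify_protease(pmsf_inhibited=True, edta_inhibited=True))
```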






EP 0 369 817 A2 



Subcloning the epr gene 

A 2.7 kb HpaI-SalI subfragment was isolated from the pNP1 insert and cloned into pBs81/6, a derivative
of pBD64 (derived by changing the PvuII site to a HindIII site using synthetic linkers). Transformants
carrying this subcloned fragment were capable of producing halos on casein plates, indicating that the
entire protease gene was present within this fragment. A representative clone was named pNP3.

The location of the gene within the pNP3 insert was further defined by subcloning a 1.6 kb EcoRV
subfragment into pBs81/6 and selecting for the colonies producing halos on casein plates. A clone which
produced a halo, and which also contained the 1.6 kb insert shown in Fig. 5, was designated pNP5. The
presence of the protease gene within this fragment was confirmed by deleting this portion of the 4 kb insert
from pNP1. pNP1 was digested with EcoRV and religated under conditions which favored recircularization
of the vector without incorporation of the 1.6 kb EcoRV insert. The DNA was transformed into GP208 and
colonies were screened on casein plates. Greater than 95% of the transformants did not produce halos,
indicating that the protease gene had been deleted from these clones. A representative clone was selected
and designated pNP6. (The small percentage of colonies that produced halos were presumed to have
vectors carrying the native epr gene resulting from recombination between the chromosomal copy of the
gene and homologous sequences within the plasmid.)

Nucleotide and deduced amino acid sequence of the epr gene

Subcloning and deletion experiments established that most of the protease gene was contained on the
1.6 kb EcoRV fragment (Fig. 5). Determination of the nucleotide sequence of the 1.6 kb EcoRV fragment
(Fig. 6) revealed an open reading frame which covered almost the entire fragment, starting 450 bp from the
left end and proceeding through the right end (see Fig. 2). Comparison of the deduced amino acid
sequence with other amino acid sequences in GENBANK indicated that the protein encoded by the ORF
had strong homology (approximately 40%) to both subtilisin (Stahl et al., 1984, J. Bact. 158:411) and Isp-1
(Koide et al., 1986, J. Bact. 167:110) from B. subtilis 168. The most probable initiation codon for this
protease gene is the ATG at position 1 in Figure 6. This ATG (second codon in the ORF) is preceded by an
excellent consensus B. subtilis ribosome binding site (AAAGGAGATGA). In addition, the first 26 amino
acids following this methionine resemble a typical B. subtilis signal sequence: a short sequence containing
two positively-charged amino acids, followed by 15 hydrophobic amino acids, a helix-breaking proline, and
a typical Ala-X-Ala signal peptidase cleavage site (Perlman et al., 1983, J. Mol. Biol. 167:391).

Sequence analysis indicated that the ORF continued past the end of the downstream EcoRV site, even
though the 1.6 kb EcoRV fragment was sufficient to encode Epr protease activity. To map the 3' end of the
gene, the DNA sequence of the overlapping KpnI to SalI fragment was determined (Fig. 6). As shown in
Figure 2, the end of the ORF was found 717 bp downstream of the EcoRV site and the entire epr gene was
found to encode a 645 amino acid protein, the first approximately 380 amino acids of which are
homologous to subtilisin (Fig. 6). The C-terminal approximately 240 amino acids are apparently not
essential for proteolytic activity, since the N-terminal 405 amino acids encoded in the 1.6 kb EcoRV fragment
are sufficient for protease activity.

Structure of the epr protein 



In vitro transcription-translation experiments were used to confirm the size of the protein. Plasmid pNP3
DNA (containing the 2.7 kb HpaI-SalI fragment with the entire epr gene) was added to an S30-coupled
transcription/translation system (New England Nuclear), resulting in the synthesis of a protein of
approximately 75,000 daltons. (Additional proteins of 60,000 and 34,000 daltons were also observed and
presumably represented processed or degraded forms of the 75,000 dalton protein.) This size agreed reasonably
well with the predicted molecular weight of 69,702 daltons for the primary product based on the deduced
amino acid sequence.

The homology between the amino-terminal half of the epr protease and subtilisin suggests that Epr
might also be produced as a preproenzyme with a pro sequence of similar size to that of subtilisin (70-80
amino acids). If true, and if there were no additional processing, this would argue that the mature Epr
enzyme has a molecular weight of around 58,000. Examination of culture supernatants, however, indicated
that the protein has a molecular weight of about 34,000. Comparison by SDS-PAGE of the proteins secreted
by B. subtilis strain GP208 containing a plasmid with the epr gene (pNP3 or pNP5) or just the parent
plasmid alone (pBs81/6) showed that the 2.7 kb HpaI-SalI fragment (Figure 1) cloned in pNP3 directed the
production of proteins of about 34,000 and 38,000 daltons, whereas the 1.6 kb EcoRV fragment cloned in
pNP5 in the same orientation (Fig. 1) directed production of just the 34,000 dalton protein. The two proteins
appear to be different forms of the Epr protease, resulting from either processing or proteolytic degradation.
Clearly, the 1.6 kb EcoRV fragment, which lacks the 3' third of the epr gene, is capable of directing the
production of an active protease similar in size to that observed when the entire gene is present. This
suggests that the protease normally undergoes C-terminal processing.
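
The size argument above can be followed with some simple arithmetic. The sketch below assumes that molecular weight scales with residue count at the average residue mass implied by the predicted primary product (69,702 daltons over 645 residues); the 75-residue pro sequence is an assumed midpoint of the 70-80 range, and the variable names are illustrative.

```python
FULL_LENGTH_AA = 645      # deduced length of the primary product
PREDICTED_MW = 69_702     # predicted molecular weight, daltons
SIGNAL_AA = 26            # putative signal sequence
PRO_AA = 75               # assumed midpoint of the 70-80 residue range

# Average mass per residue implied by the predicted primary product.
avg_residue_mass = PREDICTED_MW / FULL_LENGTH_AA

def estimated_mw(n_residues: int) -> float:
    """Approximate molecular weight for a chain of n_residues."""
    return n_residues * avg_residue_mass

# Mature enzyme if only the signal and pro sequences were removed.
mature_aa = FULL_LENGTH_AA - SIGNAL_AA - PRO_AA
print(round(estimated_mw(mature_aa)))

# Approximate residue count of the 34,000 dalton secreted species.
print(round(34_000 / avg_residue_mass))
```

On these assumptions the unprocessed mature form comes out near 58,000 to 59,000 daltons, while a 34,000 dalton species corresponds to only a little over 300 residues, which is consistent with the suggestion of C-terminal processing.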

Bacillus strain GP208 containing the epr gene on plasmid pNP3 can be used to overproduce the Epr
protease, which can then be purified by conventional procedures.



Location of epr on the B. subtilis chromosome 

To map epr on the B. subtilis chromosome, we introduced a drug-resistance marker into the
chromosome at the site of the epr gene, and used phage PBS1-mediated transduction to determine the
location of the insertion. A 1.3 kb EcoRI fragment containing a chloramphenicol acetyltransferase (cat) gene
was cloned into the unique EcoRI site on an E. coli plasmid containing the epr gene (pNP2 is depicted in
Figure 7). The resulting plasmid (pNP7) was used to transform B. subtilis GP208, and chloramphenicol
resistant transformants were selected. Since the plasmid cannot replicate autonomously in B. subtilis, the
Cmr transformants were expected to arise by virtue of a single, reciprocal recombination event between the
cloned epr gene on the plasmid and the chromosomal copy of the gene. Southern hybridization confirmed
that the cat gene had integrated into the chromosome at the site of the cloned epr gene. Mapping
experiments indicated that the inserted cat gene and epr gene are tightly linked to sacA321 (77%
co-transduction), are weakly linked to purA16 (5% co-transduction), and unlinked to hisA1. These findings
suggest that the epr gene is located near sacA in an area of the genetic map which does not contain any
other known protease genes.
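
To illustrate how co-transduction frequencies relate to map distance, the following sketch applies Wu's relation, C = (1 - d/L)^3, where d is the distance between two markers and L is the length of the transducing fragment. This relation is a general textbook approximation and is not stated to have been used in the mapping described here; the function name is illustrative, and only relative distances (d/L) are computed because no value of L is given in the text.

```python
def relative_distance(cotransduction_frequency: float) -> float:
    """Distance between two markers as a fraction of the transducing
    fragment length, from Wu's relation C = (1 - d/L)**3."""
    if not 0.0 < cotransduction_frequency <= 1.0:
        raise ValueError("frequency must be in (0, 1]")
    return 1.0 - cotransduction_frequency ** (1.0 / 3.0)

# Co-transduction frequencies reported in the text for the cat insertion in epr.
for marker, freq in [("sacA321", 0.77), ("purA16", 0.05)]:
    print(f"{marker}: d/L = {relative_distance(freq):.2f}")
```

The much smaller relative distance for sacA321 than for purA16 matches the conclusion that epr lies close to sacA.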



Construction of epr Deletion Mutant 

To create a mutant Bacillus devoid of protease activity, a deletion in the 5' end of the cloned gene was
constructed and then used to replace the wild type gene in the chromosome. pNP2 was first digested with
BamHI, which cleaves at a unique site within the epr gene, then the linear plasmid DNA was treated with
Bal31 exonuclease for 5 minutes at 32°C, religated and transformed into E. coli JM107. Plasmid DNA was
isolated from 20 transformants, digested with EcoRI and HindIII to remove the epr gene insert and analyzed
by gel electrophoresis. One of the plasmids had a 2.3 kb EcoRI-HindIII fragment replacing the 2.7 kb
fragment, indicating that approximately 400 base pairs had been deleted from the epr gene sequence. This
plasmid was designated pNP8 (Fig. 7). This deletion mutant was introduced into B. subtilis GP208 by gene
replacement methods as described above. The cat gene, contained on an EcoRI fragment from pEccI, was
introduced into the EcoRI site on pNP8 to create pNP9 (Fig. 7). This E. coli plasmid was used to transform
B. subtilis GP208 and Cmr colonies were selected. Most of the transformants produced a very small halo
and the remaining 30% produced no halos on casein plates. The absence of a halo, and therefore of protease
activity, resulted from a double crossover between chromosomal DNA and homologous sequences from a
concatemer of the plasmid DNA; these strains contain the E. coli replicon and cat gene flanked by two
copies of the deleted epr gene. To screen for a strain that had undergone a recombination event between
the two copies of the epr gene to resolve the duplication, and which had thereby jettisoned the cat gene and
the E. coli replicon, a single colony was selected and grown overnight in rich medium without drug selection.
Individual colonies arising from this culture were then screened for drug resistance and about 0.1% of these
were found to be Cms. One such strain, GP216, containing deletions within the four protease genes (apr,
npr, isp-1 and epr) was selected for further study.

The deletion in the chromosomal epr gene was confirmed by Southern hybridization. GP216, like the
Cmr parent strain, failed to produce a halo on casein plates. In liquid cultures, however, 14C-casein protease
assays indicated that the epr mutation alone does not entirely eliminate residual protease activity. A strain
with deletions in epr, apr, npr, and isp did not produce significantly less protease than a strain with
mutations in just apr, npr, and isp. Finally, growth and sporulation of the quadruple protease deleted strain
were assayed using standard laboratory media. No differences were observed in growth in LB medium
when compared to the wild-type strain. Similarly, no appreciable differences were seen in sporulation
frequency after growth on DSM medium for 30 hours (1 x 10^8 spores/ml for both GP208 and GP216).

Identification of Novel Proteolytic Activities

Strains of B. subtilis have been deleted for four non-essential protease genes, apr, npr, isp-1 and epr.
These deletions reduce total extracellular protease levels in culture supernatants of Spo+ hosts by about
96% compared to the wild-type strain, but it is desirable to decrease or eliminate the remaining 4% residual
protease activity for the production of protease-labile products in Bacillus.

Using the azocoll assay, we have identified two novel proteases that account for this residual activity in
GP227, a multiple protease deficient B. subtilis strain (aprΔ, nprΔ, eprΔ, isp-1Δ) which also contains a
gene, sacQ*, encoding a regulatory protein. The sacQ* gene product functions by enhancing the production
of degradative enzymes in Bacillus, including the residual protease activity(s), as described in our European
Patent Application 86308356.4 (Publication No. EP-A-0227260), the disclosure of which is to be regarded as
hereby incorporated by reference. Due to enhancement by sacQ*, strain GP227 produces substantially more
protease activity than GP216, which lacks sacQ*.

In general, supernatants from cultures of B. subtilis GP227 were concentrated, fractionated by passage
over a gel filtration column and assayed for protease activity. Two separate peaks of activity were eluted
from the column and designated RP-I and RP-II (residual protease) for the larger and smaller molecular
weight species, respectively. Subsequent analysis of these two peaks confirmed that each accounted for a
distinct enzymatic activity. The isolation and characterization of the RP-I and RP-II proteins, and the creation
of a deletion mutation in each of the RP-I and RP-II genes are described below.


Isolation and Characterization of RP-I 

A simple and efficient purification scheme was developed for the isolation of RP-I from spent culture
fluids. Cultures were grown in modified MRS lactobacillus media (Difco, with maltose substituted for
glucose) and concentrated approximately 10-fold using an Amicon CH2PR system equipped with a S1Y10
spiral cartridge. The concentrated supernatant was dialyzed in place against 50 mM MES, 0.4 M NaCl, pH
6.8, and fractionated over a SW3000 HPLC gel filtration column equilibrated with the same buffer. The
fractions containing protease activity were identified using a modification of the azocoll assay described
above.

Fractions which were positive for the protease activity, corresponding to the higher molecular weight
species, were pooled and concentrated using a stirred cell equipped with a YM5 membrane, dialyzed vs.
50 mM MES, 100 mM KCl, pH 6.7 and applied to a benzamidine-Sepharose liquid affinity column
equilibrated with the same buffer. Most of the protein applied to the column (97%) failed to bind to the resin;
however, RP-I protein bound quantitatively and was eluted from the column with 250 mM KCl.

SDS-PAGE analysis of the benzamidine purified RP-I revealed that the protein was greater than 95%
homogeneous, and had a molecular weight of approximately 47,000 daltons. Purification by the above
outlined procedure resulted in a 140-fold increase in specific activity, and an overall recovery of about 10%.
Isoelectric focusing gels revealed that RP-I has a pI between 4.4 and 4.7, indicating a high ratio of acidic to
basic residues. The enzyme has a pH optimum of 8.0 and a temperature maximum of 60°C when
azocoll is used as the substrate. It is completely inhibited by PMSF, indicating that it is a serine protease,
but it is not inhibited by EDTA, even at concentrations as high as 50 mM.

RP-I catalyzes the hydrolysis of protein substrates such as denatured collagen and casein as well as
ester substrates (O=C-O- vs. O=C-N- linkages) such as N-α-benzoyl-L-arginine ethyl ester, phenylalanine
methyl ester, tyrosine ethyl ester and phenylalanine ethyl ester, but does not catalyze hydrolysis of the
arginine peptide bond in the synthetic substrate N-α-benzoyl-L-arginine-4-nitroanilide. Collectively, these data
demonstrate that RP-I is a serine endoproteinase that has esterase activity and belongs to the subtilisin
superfamily of serine proteases. Furthermore, these characteristics indicate that RP-I may be the enzyme
commonly referred to as Bacillopeptidase F (Boyer et al., 1968, Arch. Biochem. Biophys. 128:442 and
Roitsch et al., 1983, J. Bact. 155:145). Although Bacillopeptidase F has been reported to be a glycoprotein,
we have not found carbohydrate to be associated with RP-I.



Cloning the Gene for RP-I 

The sequence of the amino-terminal 28 amino acids of RP-I was determined by sequential Edman
degradation on an automatic gas phase sequenator and is depicted in Figure 8. A DNA probe sequence (81
nucleotides) was synthesized based on the most frequent codon usage for these amino acids in B. subtilis
(Figure 8). The N-terminal amino acid sequence of RP-I contains two tryptophan residues (positions 7 and
18). Since tryptophan has no codon degeneracy, this facilitated the construction of a probe that was highly
specific for the gene encoding RP-I.

High molecular weight DNA was isolated from B. subtilis strain GP216, digested with each of several
different restriction endonucleases and fragments were separated by electrophoresis through a 0.8%
agarose gel. The gel was blotted onto a nitrocellulose filter by the method of Southern (supra) and
hybridized overnight with the 32P end-labeled synthetic RP-I specific probe under semi-stringent conditions
(5X SSC, 20% formamide, 1X Denhardt's at 37°C). Following hybridization, the blot was washed for one
hour at room temperature in 2X SSC, 0.1% SDS.

The RP-I specific probe hybridized to only one band in each of the restriction digests, indicating that the
probe was specific for the RP-I gene. In the PstI digest, the probe hybridized to a 6.5 kb fragment which
was a convenient size for cloning and was also large enough to contain most or all of the RP-I gene.

A clone bank containing PstI inserts in the 6-7 kb size range was prepared from B. subtilis DNA as
follows. Chromosomal DNA of strain GP216 was digested with PstI and separated on a 0.8% agarose gel.
DNA fragments of 6-7 kb were purified from the gel by electroelution and ligated with PstI digested pBR322
that had been treated with calf intestinal phosphatase to prevent recircularization of the vector upon
treatment with ligase. The ligated DNA was transformed into competent E. coli DH5 cells and plated on
media containing tetracycline. Approximately 3x10* Tetr transformants resulted, 80% of which contained
plasmids with inserts in the 6-7 kb size range.

A set of 550 transformants was screened for the presence of the RP-I insert by colony hybridization
with the 32P-labeled RP-I specific probe and seven of these transformants were found to hybridize strongly
with the probe. Plasmid DNA was isolated from six of the positive clones and the restriction digest patterns
were analyzed with PstI and HindIII. All six clones had identical restriction patterns, and the plasmid from
one of them was designated pCR83.

Using a variety of restriction enzymes, the restriction map of the pCR83 insert shown in Figure 9 was
derived. The RP-I oligomer probe, which encodes the N-terminal 28 amino acids of the mature RP-I
protease, was hybridized with restriction digests of pCR83 by the method of Southern (supra). The probe
was found to hybridize with a 0.65 kb ClaI-EcoRV fragment, suggesting that this fragment contained the 5'
end of the gene. In order to determine the orientation of the RP-I gene, the strands of the ClaI-EcoRV
fragment were separately cloned into the single-stranded phage M13. The M13 clones were then probed
with the RP-I oligomer and the results indicated that the RP-I gene is oriented in the leftward to rightward
direction according to the map in Figure 9.

The DNA sequence of a portion of the PstI insert, as shown in Figure 9, was determined, and an 81
base pair sequence (underlined in Figure 10) was found that corresponded exactly with the sequence
encoding the first 28 amino acids of the protein. The BglII and ClaI sites designated in Fig. 10 are identical
to those designated in Fig. 9 and, in addition, the EcoRV site is identical to that designated in the restriction
enzyme map shown in Fig. 9. Portions of the untranslated region surrounding the RP-I coding region are
also shown in Fig. 10; the DNA sequence underlined within the 5' untranslated region corresponds to the
putative ribosome binding site.

The DNA sequence revealed an open reading frame that began at position -15 (in Figure 10) and
proceeded through to position 2270. The most probable initiation codon for this open reading frame is the
ATG at position 1 in Figure 10. This ATG is preceded by a ribosome binding site (AAAGGGGGATGA),
which had a calculated ΔG of -17.4 kcal. The first 29 amino acids following this Met resemble a B. subtilis
signal sequence, with a short sequence containing five positively-charged amino acids, followed by 16
hydrophobic residues, a helix-breaking proline, and a typical Ala-X-Ala signal peptidase cleavage site. After
the likely signal peptidase cleavage site, a "pro" region of 164 residues is followed by the beginning of the
mature protein as confirmed by the determined N-terminal amino acid sequence. The first amino acid of the
N-terminus, which was uncertain from the protein sequence, was confirmed as the Ala residue at position
583-585 from the DNA sequence. The entire mature protein was deduced to contain 496 amino acids with a
predicted molecular weight of 52,729 daltons. This size was in reasonable agreement with the determined
molecular weight of the purified protein of 47,000 daltons. In addition, the predicted isoelectric point of the
mature enzyme (4.04) was in good agreement with the observed pI of 4.4-4.7. A search of GENBANK revealed
that the RP-I gene is partially homologous (30%) to subtilisin, to ISP-1 and, to a lesser extent (27%), to the epr
gene product.
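
The stated position of the first mature codon can be checked from the lengths given above. The sketch assumes that position 1 is the A of the initiating ATG and that the signal and "pro" regions follow the Met without gaps; the function and variable names are illustrative.

```python
SIGNAL_AA = 29     # residues following the initiating Met
PRO_AA = 164       # residues in the "pro" region
CODON_LEN = 3

def codon_span(codon_index: int) -> tuple:
    """Nucleotide positions (1-based, inclusive) of a codon, where codon 1
    is the ATG whose A is position 1."""
    start = (codon_index - 1) * CODON_LEN + 1
    return start, start + CODON_LEN - 1

# Met (1 codon) + signal + pro, then the next codon starts the mature protein.
first_mature_codon = 1 + SIGNAL_AA + PRO_AA + 1
print(first_mature_codon, codon_span(first_mature_codon))
```

Codon 195 spans nucleotides 583-585, in agreement with the position given for the N-terminal Ala.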



Cloning the RP-I gene on a multicopy replicon

The PstI fragment was removed from pCR83 and ligated into PstI-linearized pBD9, a multicopy Bacillus
replicon encoding erythromycin and kanamycin resistances. The ligated DNA was transformed into
competent GP227 cells (the sacQ* enhancement strain) and kanamycin resistant transformants were
selected. A plasmid carrying the 6.5 kb PstI insert was chosen and designated pCR88.

To confirm that this insert encoded the RP-I gene, GP227 cells containing pCR88 or pBD9 were grown
in MRS medium under selective conditions for 50 hours at 37°C. Supernatant samples were collected and
assayed for protease activity. Supernatants from the pCR88 cultures contained approximately 10-fold more
protease activity than those from the pBD9 cultures. Furthermore, this secreted protease activity was
inhibited by PMSF and, when fractionated on a denaturing protein gel, the supernatant from the pCR88
sample contained an extra protein of 47 kd. These results confirmed that the RP-I gene was encoded within
the 6.5 kb fragment, and that cloning the sequence in a multicopy replicon leads to the overproduction of
the RP-I protein.

Location of the RP-I Gene on the B. subtilis Chromosome

We mapped the location of the RP-I gene (bpr) on the B. subtilis chromosome by integrating a drug
resistance marker into the chromosome at the site of bpr and using phage PBS1-mediated transduction to
determine the location of the cat insertion. A 1.3 kb SmaI fragment containing a chloramphenicol
acetyltransferase (cat) gene was cloned into the unique EcoRV site of pCR92 (the 3.0 kb BglII fragment of
pCR83 cloned into pUC18). The EcoRV site is in the coding region of bpr (Figure 10). The resulting plasmid,
pAS112, was linearized by digestion with EcoRI and then used to transform B. subtilis strain GP216, and
chloramphenicol-resistant transformants were selected (GP238). Cmr transformants were expected to be the
result of a double cross-over between the linear plasmid and the chromosome (marker replacement).

Southern hybridization was used to confirm that the cat gene had integrated in the chromosome,
interrupting the bpr gene. Mapping experiments indicated that the inserted cat gene and bpr were strongly
linked to pyrD1 (89%) and weakly linked to metC (4%). The gene encoding the neutral protease (npr)
also maps in this region of the chromosome, although npr is less tightly linked to pyr (45% and 32%) and
more tightly linked to metC (18% and 21%) than is bpr.


Construction of a deleted version of the RP-I gene 

An internal deletion in the RP-I sequence was generated in vitro. Deletion of the 650 bp sequence
between the ClaI and EcoRV sites in the pCR83 insert removed the sequence encoding virtually the entire
amino-terminal half of the mature RP-I protein. The deletion was made by the following procedure.

The 4.5 kb PstI-EcoRI fragment of pCR78 (a pBR322 clone containing the 6.5 kb PstI fragment) was
isolated and ligated to pUC18 (a vector containing the E. coli lacZ gene encoding β-galactosidase) that had
been digested with EcoRI and PstI. The ligation mix was then transformed into E. coli DH5 cells. When
plated onto LB media containing Xgal and ampicillin, eight white colonies resulted, indicating insertion of the
fragment within the gene encoding β-galactosidase. Plasmid DNA prepared from these colonies indicated
that seven of the eight colonies contained plasmids with the 4.5 kb insert. One such plasmid, pKT2, was
digested with EcoRV and ClaI, treated with Klenow fragment to blunt the ClaI end and then recircularized by
self-ligation. The ligated DNA was then transformed into E. coli DH5 cells. Approximately 100 transformants
resulted, and plasmid DNA was isolated from Ampr transformants and analyzed by restriction digestion.
Eight of eight clones had the ClaI-EcoRV fragment deleted. One such plasmid was designated pKT2'. The
cat gene, carried on an EcoRI fragment from pEccI, was then ligated into pKT2' for use in selecting Bacillus
integrants as described above. To insert the cat gene, pKT2' was digested with EcoRI, treated with calf
intestine alkaline phosphatase and ligated to a 1.3 kb EcoRI fragment containing the cat gene. The ligated
DNA was transformed into DH5 cells and the Ampr colonies that resulted were patched onto LB media
containing chloramphenicol. Two of 100 colonies were Cmr. Plasmid DNA was isolated from these two
clones and the presence of the 1.3 kb cat gene fragment was confirmed by restriction enzyme analysis of
plasmid DNA. One of these plasmids, pKT3, was used to introduce the deleted gene into strain GP216 by
gene replacement methods.

The DNA was transformed into GP216 and chloramphenicol resistant colonies were selected.
Chromosomal DNA was extracted from 8 Cmr colonies and analyzed by Southern hybridization. One clone
contained two copies of the deleted RP-I gene resulting from a double crossover between homologous
sequences on the vector and in the chromosome. The clone was grown in the absence of chloramphenicol
selection and was then replica plated onto TBAB media containing chloramphenicol. One Cms colony was
isolated and Southern analysis confirmed that the deleted gene had replaced the wild-type RP-I gene in the
chromosome. This strain was designated GP240. Analysis of supernatants from cultures of GP240
confirmed the absence of RP-I activity.


Isolation and Characterization of RP-II 

The purification scheme for RP-II was more extensive than for RP-I because RP-II failed to bind
benzamidine-Sepharose or other protease-affinity resins, e.g., arginine-Sepharose and hemoglobin-agarose,
and we thus found it necessary to use more conventional purification techniques such as ion exchange
chromatography, gel filtration and polyacrylamide gel electrophoresis.

Concentrated crude supernatants of GP227 cultures were fractionated over DEAE-Sephacel (anion
exchange) equilibrated at pH 6.8. At this pH the RP-II protein failed to bind the resin; however,
approximately 80% of the total applied protein, including RP-I, bound the resin and was thus removed from the
sample. The column eluate was then fractionated by cation exchange chromatography using CM-Sepharose
CL-6B equilibrated at pH 6.8. RP-II was capable of binding to the resin under these conditions and was then
eluted from the column with 0.5 M KCl. To further enhance the resolution of the cation exchange step, the
RP-II eluate was then refractionated over a 4.6 x 250 mm WCX (weak cation exchange) HPLC column
developed with a linear gradient of NaCl. The WCX pool was then size-fractionated over a TSK-125 HPLC
column. The RP-II peak was then fractionated a second time over the same column, yielding a nearly
homogeneous preparation of RP-II when analyzed by SDS-PAGE. The protease was purified over 6900-fold
and represented approximately 0.01% of the total protein in culture fluids of GP227. Alternatively,
approximately 30-fold more RP-II can be purified from a Bacillus strain that is RP-I- and contains the sacQ*
enhancing sequence (U.S.S.N. 921,343, assigned to the same assignee and hereby incorporated by
reference), since the quantity of RP-II produced by such a strain is substantially increased, representing
about 0.3% of total protein in the culture fluid.
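
The purification figures above can be cross-checked with a few lines of arithmetic. The sketch assumes that fold purification tracks the mass fraction of RP-II in the sample, which is a simplification; the function and variable names are illustrative, and the input values are the approximate figures quoted in the text.

```python
def max_fold_purification(fraction_of_total_protein: float) -> float:
    """Fold purification that would correspond to a pure preparation."""
    return 1.0 / fraction_of_total_protein

def implied_purity(fold: float, fraction_of_total_protein: float) -> float:
    """Fraction of the final preparation that is the target protein."""
    return fold * fraction_of_total_protein

gp227_fraction = 0.0001   # about 0.01% of total protein (from the text)
sacq_fraction = 0.003     # about 0.3% in the RP-I-deficient sacQ* strain (from the text)

print(max_fold_purification(gp227_fraction))   # ceiling on fold purification
print(implied_purity(6900, gp227_fraction))    # purity implied by 6900-fold
print(sacq_fraction / gp227_fraction)          # gain in starting abundance
```

With a starting abundance of about 0.01%, the ceiling is about 10,000-fold, so a 6900-fold purification is of the order expected for a nearly homogeneous preparation given how approximate the abundance estimate is. The ratio of the two abundances reproduces the roughly 30-fold gain quoted for the alternative strain.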

RP-II was insensitive to PMSF treatment, and therefore is not a serine protease. SDS-PAGE analysis
indicated that RP-II has a molecular mass of 27.3 kd. The failure of RP-II to bind DEAE at pH 6.7 and
PAE-300 (an HPLC anionic column) at pH 8.3 indicated that the protein has a basic isoelectric point which is
greater than 8.3 (pI = 8.7 by chromatofocusing). RP-II is highly sensitive to dithiothreitol (DTT, a sulfhydryl
reducing agent), being quantitatively inhibited at levels as low as 1 mM in the azocoll assay. RP-II is also
sensitive to combinations of other sulfhydryl reagents with metal chelators (i.e., mercaptoethanol with
EDTA). Inhibition of proteases by sulfhydryl reagents is relatively rare and has only been described for a
few proteases, such as collagenase from C. histolyticum and carboxypeptidase A. RP-II also possesses
esterase activity as demonstrated by its ability to hydrolyze phenylalanine methyl ester and
N-t-BOC-L-glutamic acid-α-phenyl ester.

In order to obtain the cleanest possible sample of RP-II for sequence analysis, a final purification step
was used which involved separation by polyacrylamide gel electrophoresis. Following electrophoresis,
proteins were transferred electrophoretically from the gel to a sheet of polyvinylidene difluoride (PVDF)
membrane. RP-II was visualized on the hydrophobic membrane as a "wet-spot" and the corresponding area
was cut from the sheet and its amino-terminal amino acid sequence determined.

The sequence of the 15 amino-terminal residues of RP-II
(Ser-Ile-Ile-Gly-Thr-Asp-Glu-Arg-Thr-Arg-Ile-Ser-Ser-Thr-Thr-) is rich in serine and arginine residues.
Since both serine and arginine have a high degree of codon degeneracy, this increased the difficulty of
creating a highly specific probe. Therefore, additional amino acid sequence information was obtained from
internal peptides that contained one or more non-degenerate amino acid residues.
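The degeneracy problem described above can be made concrete by counting codons. The following sketch uses the standard genetic code; the variable names are ours, and it only illustrates why the amino-terminal sequence is a poor basis for a probe.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2, "Gln": 2, "Glu": 2,
    "Gly": 4, "His": 2, "Ile": 3, "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2,
    "Pro": 4, "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}

# Amino-terminal sequence of RP-II as given in the text.
n_terminus = "Ser-Ile-Ile-Gly-Thr-Asp-Glu-Arg-Thr-Arg-Ile-Ser-Ser-Thr-Thr".split("-")

# Possible codons at each position, and the number of distinct
# nucleotide sequences that could encode the whole peptide.
per_residue = [CODONS_PER_RESIDUE[aa] for aa in n_terminus]
total_sequences = prod(per_residue)

# Positions with at most two possible codons.
low_degeneracy = sum(1 for n in per_residue if n <= 2)

print(per_residue)
print(f"{total_sequences:,} possible coding sequences")
print(f"{low_degeneracy} of {len(n_terminus)} residues have one or two codons")
```

Only the Asp and Glu positions have as few as two codons, and no position is encoded by a single codon, which is consistent with the decision to sequence internal peptides containing tryptophan or methionine.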



Sequence Analysis of Internal Peptide Fragments of RP-II

Tryptic peptides from purified RP-II were produced and isolated using reverse-phase HPLC. Since each
of the amino acids tryptophan and methionine is encoded by only one codon, a synthetic
nucleotide probe, or "guess-mer", that encodes one or more of either of these amino acids will be highly
specific for its complementary nucleotide sequence.

An HPLC chromatogram of the trypsin-digested RP-II mixture was monitored at three wavelengths: 210
nm (peptide bonds), 227 nm (aromatic residues, i.e., phenylalanine, tyrosine, tryptophan), and 292 nm
(conjugated ring structure of tryptophan). The 292 nm trace was used to identify peptides of RP-II that
contain a tryptophan residue. The 210 nm trace was used to obtain baseline-resolved fragments (i.e.,
single-species peptides) for sequence analysis. Based on the 210 nm and 292 nm traces, three fragments were
chosen for sequence analysis: T90, T94, and T92. Guess-mer oligomers were then synthesized based on
the amino acid sequences of these fragments.

Figure 11(a) shows the amino-terminal sequence obtained for RP-II fragment T90. A total of 15 residues
were obtained, 67% of which have only one or two possible codons. The specificity of a probe (BRT90)
constructed based on the sequence of fragment T90 was enhanced by the presence of a predicted
tryptophan residue (position 12). The number in parentheses at each position represents the number of
possible codons for each residue.

The amino-terminal sequence of RP-II fragment T94 is shown in Figure 11(b). Of the 30 residues
determined, none were found to be tryptophan. Although only 36% of the residues (numbers 1-25) have two
possible codons, the length of the corresponding 75-mer probe (707) renders it useful for corroborating
hybridization experiments conducted with the T90 probe.

The third and final probe was constructed based on sequence information obtained from RP-II fragment
T92 (Fig. 11(c)). Because of the relatively high degree of degeneracy at the beginning and end of this
sequence, a probe was constructed based on residues 15-27. The resulting 39-mer probe (715) codes for a
peptide of which half the residues have only one or two possible codons. Furthermore, the specificity of this
probe was enhanced by the tandem location of a methionine and tryptophan residue at positions 26 and 27.
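The probe lengths follow directly from the residue ranges, at three nucleotides per residue. The sketch below tabulates them. The residue span given for BRT90 is an assumption on our part: the text reports 15 residues for fragment T90 but does not say which of them the probe covers.

```python
# (probe name, source fragment, first residue, last residue), 1-based and inclusive.
# The BRT90 span is assumed; the spans for 707 and 715 come from the text.
probes = [
    ("BRT90", "T90", 1, 15),   # assumed span
    ("707", "T94", 1, 25),
    ("715", "T92", 15, 27),
]


def probe_length(first, last):
    """Length in nucleotides of a probe covering residues first..last."""
    return 3 * (last - first + 1)


lengths = {name: probe_length(first, last) for name, _, first, last in probes}
residues_covered = sum(last - first + 1 for _, _, first, last in probes)

for name, fragment, first, last in probes:
    print(f"{name} ({fragment}, residues {first}-{last}): {lengths[name]}-mer")
print(f"residues covered by all three probes: {residues_covered}")
```

The 75-mer and 39-mer lengths match the text. Under the assumed BRT90 span, the three probes together cover 53 residues, which agrees with the total quoted later in the cloning section.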



Cloning of RP-II 

Chromosomal DNA was cut with various restriction enzymes, and a series of hybridizations using the
radiolabeled oligomer probes BRT90 and 707 was performed. Both probes were labelled with 32P and
hybridized to a Southern blot of GP241 DNA digested with BamHI, BglII, HincII, PstI, or EcoRI under
semi-stringent conditions (5 x SSC, 10% formamide, 1 x Denhardt's, 100 µg/ml denatured salmon sperm DNA at
37°C). After hybridization for 18 hours, the blots were washed with 2 x SSC, 0.1% SDS for one hour at
37°C, and then washed with the same buffer at 45°C for one hour. The results are shown in Fig. 12. Both
probes hybridized to the same restriction fragments: HincII, ~1 kb; PstI, 3-4 kb; and EcoRI, 6-7 kb. The
probes also hybridized to very large fragments in the BamHI- and BglII-digested DNAs.
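For readers unfamiliar with the SSC notation, the salt concentrations in the hybridization and wash buffers can be worked out from the conventional definition of 1 x SSC (0.15 M NaCl, 0.015 M sodium citrate). The helper below is a hypothetical convenience function, not part of the described procedure.

```python
# Conventional composition of 1 x SSC, in mol/L.
SSC_1X = {"NaCl": 0.150, "sodium citrate": 0.015}


def ssc(strength):
    """Return the molar composition of SSC at the given strength (e.g. 5 for 5 x SSC)."""
    return {salt: molarity * strength for salt, molarity in SSC_1X.items()}


hybridization = ssc(5)   # 5 x SSC, used for hybridization at 37 degrees C
wash = ssc(2)            # 2 x SSC, used for the 37 and 45 degree C washes

for label, buffer in (("5 x SSC", hybridization), ("2 x SSC", wash)):
    parts = ", ".join(f"{molarity:.3f} M {salt}" for salt, molarity in buffer.items())
    print(f"{label}: {parts}")
```

The drop in salt from hybridization to wash, together with the rise in wash temperature from 37°C to 45°C, raises stringency in two steps.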

PstI fragments of 3-4 kb were used to construct a DNA library, as follows. pBR322 was digested with
PstI and treated with CIAP. Size-selected PstI-digested GP241 chromosomal DNA of 3-4.5 kb was
electroeluted from a 0.8% agarose gel. Approximately 0.1 µg of PstI-cut pBR322 and 0.2 ng of the
size-selected DNA were ligated at 16°C overnight. The ligated DNA was then transformed into E. coli DH5
cells. Approximately 10,000 colonies resulted, of which 60% contained plasmids with insert DNA. A total
of 1400 colonies were patched onto LB plates containing 15 µg/ml tetracycline with nitrocellulose filters.
After the colonies were grown at 37°C overnight, the filters were processed to lyse the colonies, denature
the DNA, and remove cell debris. The filters were then baked at 80°C for two hours. Colony hybridization
was performed using radiolabeled probe 707. Hybridization conditions were identical to those used in the
Southern blot experiments. Analysis of the plasmid DNA from four positive colonies identified one that
carried a 3.6 kb insert which hybridized strongly to both probes. The plasmid, pLP1, is shown in
Fig. 13(b).

A restriction map of pLP1 (Fig. 13(a)) was constructed using a variety of restriction endonucleases to
digest pLP1, transferring the size-fractionated digests onto nitrocellulose, and probing the immobilized
restriction fragments with the radiolabeled oligomers described above. It was determined that all three
oligomers, which encode a total of 53 amino acids within the RP-II protein, hybridized with the 1.1 kb
HincII fragment.

The 1.1 kb HincII fragment was isolated and cloned into M13mp18. A phage clone containing the HincII
fragment was identified by hybridization with one of the oligomer probes. The DNA sequence of the HincII
fragment revealed an open reading frame that spanned most of the fragment (position -24 to position 939 in
Figure 14). The most probable initiation codon for this open reading frame is the ATG at position 1 in Figure
14. This ATG is preceded by a B. subtilis ribosome binding site (AAAGGAGG), which has a calculated ΔG
of -16.0 kcal. The first 33 amino acids following this Met resembled a B. subtilis signal sequence, with a
short sequence containing four positively-charged amino acids, followed by 18 hydrophobic residues, a
helix-breaking proline, and a typical Ala-X-Ala signal peptidase cleavage site. After the presumed signal
peptidase cleavage site, a "pro" region of 58 residues is found, followed by the beginning of the mature
protein as determined by the N-terminal amino acid sequence of the purified protein. The amino-terminal 16
residues are underlined and designated "N terminus". Amino acid sequences from which the three
guess-mers were deduced are also underlined and designated T94, T92, and T90. The determined amino acid
sequences of the peptides matched the deduced amino acid sequence except at two positions: the deduced
sequence has a serine residue encoded by nucleotides 379-381 and a cysteine residue encoded by
nucleotides 391-393, whereas the determined amino acid sequence predicted a cysteine residue (position 14,
T94 peptide) and an asparagine residue (position 18, T94 peptide), respectively (Figure 11). The entire
mature protein was deduced to contain 221 amino acids with a predicted molecular weight of 23,941 daltons.
This size was in approximate agreement with the determined molecular weight of the purified protein,
28,000 daltons.
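The residue counts given for the signal, "pro", and mature regions can be checked against the stated end of the open reading frame. This sketch assumes that the 33-residue signal sequence includes the initiating Met and that the quoted end position (939) includes the stop codon. Neither point is stated outright in the text.

```python
SIGNAL_RESIDUES = 33    # presumed signal sequence (assumed to include the initiating Met)
PRO_RESIDUES = 58       # "pro" region
MATURE_RESIDUES = 221   # deduced mature protein

precursor_residues = SIGNAL_RESIDUES + PRO_RESIDUES + MATURE_RESIDUES

# Nucleotide 1 is the A of the initiating ATG, so codon k spans 3k-2 .. 3k.
last_sense_nucleotide = 3 * precursor_residues
stop_codon_end = last_sense_nucleotide + 3

# First codon and first nucleotide of the mature protein.
first_mature_codon = SIGNAL_RESIDUES + PRO_RESIDUES + 1
first_mature_nucleotide = 3 * (first_mature_codon - 1) + 1

print(f"precursor length: {precursor_residues} residues")
print(f"coding nucleotides: 1-{last_sense_nucleotide}; stop codon ends at {stop_codon_end}")
print(f"mature protein starts at codon {first_mature_codon} (nucleotide {first_mature_nucleotide})")
```

Under these assumptions the three regions account for the whole reading frame from position 1 to position 939.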

The deduced amino acid sequence showed only limited homology to other sequences in GENBANK.
The strongest homology was to human protease E and bovine procarboxypeptidase A in a 25 amino acid
sequence within RP-II (131-155, encoded by nucleotides 391-465; Figure 14).
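The nucleotide and residue coordinates quoted in this section are related by simple codon arithmetic, with nucleotide 1 taken as the A of the initiating ATG. The two helper functions below are hypothetical names introduced only for illustration.

```python
def residue_for_nucleotide(position):
    """Residue number (Met = 1) whose codon contains the given nucleotide position."""
    return (position - 1) // 3 + 1


def nucleotide_span(residue):
    """First and last nucleotide positions of the codon for the given residue."""
    first = 3 * (residue - 1) + 1
    return first, first + 2


# Homologous region: residues 131-155, nucleotides 391-465.
region = (nucleotide_span(131)[0], nucleotide_span(155)[1])
region_length = 155 - 131 + 1

# The two positions where the peptide and deduced sequences disagree.
mismatch_residues = [residue_for_nucleotide(379), residue_for_nucleotide(391)]

print(f"residues 131-155 span nucleotides {region[0]}-{region[1]} ({region_length} residues)")
print(f"mismatches fall at residues {mismatch_residues}")
```

The two mismatched codons lie four residues apart, matching the spacing between positions 14 and 18 of the T94 peptide.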

To further confirm the identity of the RP-II gene, the 3.6 kb PstI fragment was engineered onto a
multi-copy Bacillus replicon to test for overproduction of the RP-II protein. For this purpose the Bacillus
plasmid pBs81/6 (Cmʳ, Neoʳ) was inserted into the E. coli clone containing the RP-II gene. Plasmid pLP1
(8.0 kb) was digested with EcoRI, which cuts at a single site outside the PstI insert, and ligated to
EcoRI-digested pBs81/6 (4.5 kb; Fig. 13(a)). The resulting plasmid (pCR130) was used to transform GP241,
and chloramphenicol- or neomycin-resistant transformants were selected. Supernatant samples from cultures
of the transformants were found to contain 3- to 4-fold more azocoll-hydrolyzing activity than the
supernatants from cells containing only the plasmid pBs81/6, indicating that the gene for RP-II is wholly
contained within the 3.6 kb PstI fragment.
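The plasmid sizes quoted above are mutually consistent, as the short calculation below shows. The pBR322 length is the published value for that vector. The size computed for pCR130 is our own derived estimate and does not appear in the text.

```python
PBR322_BP = 4361     # published length of pBR322, in base pairs
INSERT_KB = 3.6      # PstI insert carrying the RP-II gene (from the text)
PLP1_KB = 8.0        # size of pLP1 as stated in the text
PBS81_6_KB = 4.5     # size of pBs81/6 as stated in the text

# pLP1 is pBR322 plus the 3.6 kb insert.
plp1_expected_kb = PBR322_BP / 1000 + INSERT_KB

# pCR130 is pLP1 joined to pBs81/6 at their EcoRI sites (derived estimate).
pcr130_estimated_kb = PLP1_KB + PBS81_6_KB

print(f"pLP1 expected size: {plp1_expected_kb:.2f} kb (stated: {PLP1_KB} kb)")
print(f"pCR130 estimated size: {pcr130_estimated_kb:.1f} kb")
```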




Location of the RP-II Gene on the B. subtilis chromosome 

In order to map the RP-II gene (mpr) on the B. subtilis chromosome, we used B. subtilis strain GP261,
described below, which contains the cat gene inserted into the chromosome at the site of the mpr gene,
and used phage PBS1 transduction to determine the location of the cat insertion.

Mapping experiments indicated that the inserted cat gene and mpr were linked to cysA14 (7%
co-transduction) and to aroI906 (36% co-transduction) but unlinked to purA16 and dal. These data indicated
that the mpr gene lies between cysA and aroI, in an area of the genetic map not previously known to contain
protease genes.



Deletion of the RP-II Gene on the Bacillus Chromosome 

As described above for the other Bacillus subtilis proteases, an RP-II Bacillus deletion mutant was
constructed by substituting a deleted version of the RP-II gene for the complete copy on the chromosome.
To ensure the deletion of the entire RP-II gene, a region of DNA was deleted between the two HpaI sites in
the insert (Fig. 13(a)). This region contains the entire 1.1 kb HincII fragment and an additional 0.9 kb of
DNA upstream of the HincII fragment.

To create the deletion, plasmid pLP1 (the pBR322 clone containing the 3.6 kb PstI fragment) was
digested with HpaI and size-fractionated on an agarose gel. Digestion of pLP1 with HpaI releases the 2
kb internal HpaI fragment and a larger HpaI fragment containing the vector backbone and the flanking
segments of the PstI insert (Fig. 13(c)). The larger HpaI fragment was purified and ligated with purified
blunt-ended DNA fragments containing either the chloramphenicol-resistance (cat) gene from pMI1101
(Youngman et al., 1984, supra) or the bleomycin-resistance (ble) gene from pKT4, a derivative of pUB110
(available from the Bacillus Stock Center, Columbus, Ohio).

The cat gene was isolated as a 1.6 kb SmaI fragment from pEccI. This DNA was ligated to the isolated
large HpaI fragment of pLP1. The ligated DNA was then transformed into E. coli DH5 cells. Approximately
20 Tetʳ colonies resulted. One colony was found to be Cmʳ when the colonies were patched onto LB
medium + 5 µg/ml chloramphenicol. Analysis of the plasmid DNA from this colony confirmed the presence
of the cat gene. This plasmid was called pLP2.

Plasmid pLP2 (Fig. 13(c)) was digested with PstI and then transformed into GP241. This transformation
gave approximately 280 Cmʳ colonies; one colony was chosen for further study (GP261). Competent cells of
GP261 were prepared and then transformed with pDP104 (sacQ*); 10 Tetʳ colonies resulted. Four colonies
were grown in MRS medium and the presence of sacQ* was confirmed by elevated levels of aminopeptidase.
This strain was called GP262.

Since the cat gene was often used to select other vectors, a different antibiotic resistance marker was
also used to mark the deletion of the RP-II gene on the Bacillus chromosome, namely the
bleomycin-resistance gene of pUB110. The ble gene was isolated from plasmid pKT4, a derivative of pUB110,
as an EcoRV-SmaI fragment and ligated to the purified large HpaI fragment (Fig. 13(c)) before
transformation into E. coli DH5 cells; tetracycline-resistant transformants were selected and then
screened for resistance to phleomycin, a derivative of bleomycin, by patching onto TBAB plates containing
phleomycin at a final concentration of 2 µg/ml. Of 47 Tetʳ transformants so screened, seven were also
phleomycin-resistant. The insertion of the ble gene was confirmed by restriction analysis of the plasmids
isolated from these clones. One of these plasmids, pCR125 (Fig. 13(c)), was used to introduce the deleted
gene containing the ble gene marker into the strain GP241 by gene replacement methods, as described below.

Plasmid pCR125 was digested with EcoRI and the linear plasmid DNA was used to transform GP241 to
phleomycin resistance. Resistant transformants were selected by plating the transformed cells onto TBAB
agar plates containing a gradient of 0-5 µg/ml phleomycin across the plate. Transformants that were
resistant to approximately 2.5 µg/ml phleomycin on the plates were single-colony purified on TBAB
phleomycin plates and thereafter grown on TBAB without selective antibiotic (strain GP263).

The strains bearing the RP-II deletion and the cat or ble insertion in the RP-II gene, along with the
positive regulatory element, sacQ*, were evaluated for extracellular enzyme production, particularly
protease and esterase activities.

The data given in Table 1, below, indicate that the presence of sacQ* in B. subtilis strain GP239, which
bears null mutations in the five protease genes apr (subtilisin), npr (neutral protease), epr (extracellular
protease), isp (internal serine protease), and bpr, enhanced production of the RP-II protease (which also
has esterase activity). To assess the influence on protease production of deleting RP-II from strains of
B. subtilis bearing the sacQ* regulatory element, the following experiments were performed.

Independent clones of the RP-II deletion strain GP262 were shown to produce negligible amounts of
esterase activity and no detectable levels of endoprotease activity using azocoll as substrate (Table 1). To
confirm the absence of protease activity, culture supernatants from GP262 were concentrated to the extent
that the equivalent of 1 ml of supernatant could be assayed. Even after 2.5 hours of incubation of the
equivalent of 1 ml of supernatant with the azocoll substrate, there was no detectable protease activity in
the RP-II-deleted strain. By comparison, 50 µl of supernatant from GP239 typically gave an A520 in the
azocoll assay of over 2.0 after a one-hour incubation at 55°C. (The presence of sacQ* was confirmed by
measurement of the levels of aminopeptidase present in the culture fluids of this strain, which were 50- to
80-fold higher than in analogous strains lacking sacQ*.) Thus, deletion of the two residual proteases, RP-I
and RP-II, in Bacillus yields a strain that is largely incapable of producing extracellular endoproteases, as
measured using azocoll as a substrate under the conditions described above.

Table 1

Strain        Aminopeptidase   Protease   Esterase
              (U/ml)           (U/ml)     (U/ml)

GP238         0.04             0.13       0.02
GP239         1.7              84         1.16
GP262, AI     2.9              ND         0.08
GP262, AII    3.4              ND         0.11
GP262, BI     1.9              ND         0.10
GP262, BII    2.5              ND         0.10

Aminopeptidase was measured using L-leucine-p-nitroanilide as substrate (1 unit = µmol substrate
hydrolyzed/minute). Protease was measured using the standard azocoll assay (1 unit = ΔA520 of 0.5/hour).
Esterase was measured using N-t-BOC-glutamic acid-α-phenyl ester as substrate (1 unit = µmol substrate
hydrolyzed/minute). Strain GP238 has the genotype Δapr, Δnpr, Δepr, Δisp, Δrp-I; strain GP239 has the
genotype Δapr, Δnpr, Δepr, Δisp, Δrp-I, sacQ*; and GP262 AI, AII, BI, and BII are independent clones of
GP262 containing sacQ* and a cat insertional deletion in RP-II. ND means not detectable.
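The unit definitions in the Table 1 footnote can be used to relate the raw assay readings to the tabulated values. The sketch below is illustrative only: the function name is ours, and it assumes that the azocoll response is linear in enzyme amount and time over the range used.

```python
def azocoll_units_per_ml(delta_a520, hours, sample_ml):
    """Protease units per ml, where 1 unit gives a change in A520 of 0.5 per hour."""
    units_in_sample = delta_a520 / 0.5 / hours
    return units_in_sample / sample_ml


# GP239: 50 microlitres gave an A520 of over 2.0 after one hour, so this is a lower bound.
gp239_lower_bound = azocoll_units_per_ml(2.0, 1.0, 0.050)

# Aminopeptidase levels (U/ml) from Table 1, relative to GP238, which lacks sacQ*.
baseline = 0.04
gp262_clones = {"AI": 2.9, "AII": 3.4, "BI": 1.9, "BII": 2.5}
fold_increase = {clone: value / baseline for clone, value in gp262_clones.items()}

print(f"GP239 protease: more than {gp239_lower_bound:.0f} U/ml")
for clone, fold in fold_increase.items():
    print(f"GP262 {clone}: aminopeptidase about {fold:.0f}-fold over GP238")
```

The lower bound for GP239 is in line with the 84 U/ml listed in Table 1, and the aminopeptidase ratios for the GP262 clones fall close to the 50- to 80-fold range quoted in the text.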

Referring to Table 2, several protease-deficient strains were also tested for protease activity using the
more sensitive resorufin-labelled casein assay described earlier. As is shown in Table 2, although the strain
GP263, deleted for six protease genes, exhibited no detectable protease activity in the azocoll test, such
activity was detected in the resorufin-labelled casein test. GP271, the spo0A derivative of GP263, exhibited
no detectable protease activity in either test, indicating that the protease activity detected in GP263
may be under sporulation control. The minor casein-detectable activity present in culture fluids of GP263
apparently belongs to the serine protease family, because of its sensitivity to inhibition by PMSF. In the
presence of PMSF, no detectable protease activity was present in cultures of GP263.

Table 2

                                                                               Remaining activity
                                                                               (% of wild-type at t20)
Strain   Genotype                                                              1       2

IS75     Wild-type                                                             100     100
GP202    Δapr, Δnpr, amyE                                                      5       8
GP208    Δapr, Δnpr, Δisp-1, amyE, met⁻                                        5       8
GP263    Δapr, Δnpr, Δisp-1, Δepr, Δbpr, Δmpr, Δhpr, amyE, met⁻                ND      0.5-1
GP271    spo0A, Δapr, Δnpr, Δisp-1, Δepr, Δbpr, Δmpr, Δhpr, amyE, met⁻         ND      ND

1 As measured using azocoll as substrate.
2 As measured using resorufin casein as substrate.







Other embodiments are feasible.

For example, in some instances it may be desirable to express, rather than mutate or delete, a gene or
genes encoding protease(s).

This could be done, for example, to produce the proteases for purposes such as improvement of the
cleaning activity of laundry detergents or for use in industrial processes. This can be accomplished either
by inserting regulatory DNA (any appropriate Bacillus promoter and, if desired, ribosome binding site and/or
signal encoding sequence) upstream of the protease-encoding gene or, alternatively, by inserting the
protease-encoding gene into a Bacillus expression or secretion vector; the vector can then be transformed
into a Bacillus strain for production (or secretion) of the protease, which is then isolated by conventional
techniques. Alternatively, the protease can be overproduced by inserting one or more copies of the
protease gene on a vector into a host strain containing a regulatory gene such as sacQ*.



Claims 

1. A Bacillus cell characterised in containing a mutation in the epr gene resulting in inhibition of the
production by said cell of proteolytically active epr gene product.

2. A Bacillus cell according to Claim 1, characterised in further containing a mutation in the
RP-I-encoding gene, said mutation resulting in inhibition of the production by said cell of proteolytically
active RP-I.

3. A Bacillus cell characterised in containing a mutation in the RP-I-encoding gene resulting in inhibition
of the production by said cell of proteolytically active RP-I.

4. A Bacillus cell according to any preceding claim, characterised in further containing a mutation in the
RP-II-encoding gene, resulting in inhibition of the production by said cell of proteolytically active RP-II.

5. A Bacillus cell characterised in containing a mutation in the RP-II-encoding gene resulting in inhibition
of the production by said cell of proteolytically active RP-II.

6. A Bacillus cell according to any preceding claim, characterised in further containing mutations in the
apr and npr genes encoding extracellular proteases, said mutations resulting in inhibition of the production
by said cell of said encoded proteolytic activities.

7. A Bacillus cell according to any preceding claim, further characterised in that the or each said
mutation comprises a deletion within the coding region of the gene.

8. A Bacillus cell according to any preceding claim, further containing a mutation in the isp-1 gene
encoding an intracellular protease.

9. A Bacillus cell according to any of Claims 1 to 7, characterised in further containing a mutation which
reduces said cell's capacity to produce one or more sporulation-dependent proteases.

10. A Bacillus cell according to Claim 9, further characterised in that said sporulation-dependent
protease mutation blocks sporulation at an early stage but does not eliminate the cell's ability to be
transformed by purified DNA.

11. A Bacillus cell according to Claim 10, further characterised in that said sporulation-dependent
protease mutation is in the spo0A gene.

12. A Bacillus cell according to any preceding claim, further characterised in being a Bacillus subtilis
cell.

13. A Bacillus cell according to any preceding claim, characterised in further comprising a gene
encoding a heterologous polypeptide.

14. A cell according to Claim 13, further characterised in that said heterologous polypeptide is a
medically useful protein, preferably a hormone, vaccine, antiviral protein, antitumour protein, antibody or
clotting protein.

15. A cell according to Claim 13, further characterised in that said heterologous polypeptide is an
agriculturally or industrially useful protein, preferably a pesticide or enzyme.

16. A method for producing a heterologous polypeptide in a Bacillus cell, characterised in comprising:
introducing into said cell a gene encoding said heterologous polypeptide, modified to be expressed in said
cell, said Bacillus cell containing mutations in the apr and npr genes, and further containing mutations in
one or more of the genes encoding the Epr protease, RP-I, or RP-II.

17. A method according to Claim 16, characterised in further containing a mutation in the isp-1 gene
encoding intracellular protease I.

18. A method according to Claims 16 or 17, further characterised in that said heterologous polypeptide
is normally unstable in a Bacillus cell.

19. A method according to any of Claims 16, 17 or 18, further characterised in that said cell is a
Bacillus subtilis cell.

20. A method according to any of Claims 16 to 19, further characterised in that said cell further contains
a mutation which reduces said cell's capacity to produce one or more sporulation-dependent proteases,
said mutation being in the spo0A gene.

21. A method according to any of Claims 16 to 20, further characterised in that said heterologous
polypeptide is a medically useful protein, or an agriculturally or industrially useful protein.

22. Purified DNA comprising a Bacillus epr gene.

23. Purified DNA comprising a Bacillus gene encoding RP-I.

24. Purified DNA comprising a Bacillus gene encoding RP-II.

25. A vector comprising a Bacillus epr gene and regulatory DNA operationally associated with said
gene.

26. A vector comprising a Bacillus gene encoding RP-I and regulatory DNA operationally associated
with said gene.

27. A vector comprising a Bacillus gene encoding RP-II and regulatory DNA operationally associated
with said gene.

28. A Bacillus cell transformed with a vector according to any of Claims 25, 26 or 27.

29. Substantially pure Bacillus Epr protease.

30. Substantially pure Bacillus residual protease I (RP-I).

31. Substantially pure Bacillus residual protease II (RP-II).



FIG. 3
FIG. 5
FIG. 6-1

-446 ATCCGAGCTTATCGGCCCACTCGTTCCCAAACACACTCGCCATGAAATCAGCATACCCC
GGAATCGGCAAGCTCGTTAAAATCAAGAAGACAGACCCGATAATAATCAGCGGCATGGT 
CAGGATAATCCGTCACGCAAAGCGCTGAGATGCCGCTGCCCGGCAATTTTCCCGGCGAC 

AGGCATTATTTTTTCCTCCATCACCCGAGTGAATGTGCTCATCTTAAAAACCCCCTTTT 
CTCATTGCTTTGTGAACAACCTCCGCAATGTTTTCTTTATCTTATTTTGAAAACGCTTA 
CAAATTCATTTGGAAAATTTCCTCTTCATGCGGAAAAAATCTGCATTTTGCTAAACAAC 
CCTGCCCATGAAAAATTTTTTCCTTCTTACTATTAATCTCTCTTTTTTTCTCCGATATA 
TATATCAAACATCATAGAiMA££A»I£MTC 



+1 ATG AAA AAC ATG TCT TGC AAA
met lys asn met ser cys lys

46 TTC AGT TTT CTC ACC ATA GGC
phe ser phe leu thr ile gly

91 AGC GAG AAA GAG GTT ATT GTG
ser glu lys glu val ile val

136 GAA ACC ATC CTG GAC AGT GAT
glu thr ile leu asp ser asp

181 CAT CTT CCC GCG GTA GCG GTC
his leu pro ala val ala val

BamHI

226 GAA TTA AAG CAG GAT CCT GAT
glu leu lys gln asp pro asp

271 TCA TTT ACC GCA GCA GAC AGC
ser phe thr ala ala asp ser

316 GGC ACT GAC ACC TCT GAC AAC
gly thr asp thr ser asp asn

361 ATT CAG GTG AAA CAG GCT TGG
ile gln val lys gln ala trp
FIG. 6-2

406 ATC AAA ATT GCC GTC ATT GAC AGC GGG ATC TCC CCC CAC GAT GAC
ile lys ile ala val ile asp ser gly ile ser pro his asp asp

451 CTG TCG ATT GCC GGC GGG TAT TCA GCT GTC AGT TAT ACC TCT TCT
leu ser ile ala gly gly tyr ser ala val ser tyr thr ser ser

496 TAC AAA GAT GAT AAC GGC CAC GGA ACA CAT GTC GCA GGG ATT ATC
tyr lys asp asp asn gly his gly thr his val ala gly ile ile

541 GGA GCC AAG CAT AAC GGC TAC GGA ATT GAC GGC ATC GCA CCG GAA
gly ala lys his asn gly tyr gly ile asp gly ile ala pro glu

586 GCA CAA ATA TAC GCG GTT AAA GCG CTT GAT CAG AAC GGC TCG GGG
ala gln ile tyr ala val lys ala leu asp gln asn gly ser gly

631 GAT CTT CAA AGT CTT CTC CAA GGA ATT GAC TGG TCG ATC GCA AAC
asp leu gln ser leu leu gln gly ile asp trp ser ile ala asn

676 AGG ATG GAC ATC GTC AAT ATG AGC CTT GGC ACG ACG TCA GAC AGC
arg met asp ile val asn met ser leu gly thr thr ser asp ser

721 AAA ATC CTT CAT GAC GCC GTG AAC AAA GCA TAT GAA CAA GGT GTT
lys ile leu his asp ala val asn lys ala tyr glu gln gly val

766 CTG CTT GTT GCC GCA AGC GGT AAC GAC GGA AAC GGC AAG CCA GTG
leu leu val ala ala ser gly asn asp gly asn gly lys pro val

811 AAT TAT CCG GCG GCA TAC AGC AGT GTC GTT GCG GTT TCA GCA ACA
asn tyr pro ala ala tyr ser ser val val ala val ser ala thr

856 AAC GAA AAG AAT CAG CTT GCC TCC TTT TCA ACA ACT GGA GAT GAA
asn glu lys asn gln leu ala ser phe ser thr thr gly asp glu

901 GTT GAA TTT TCA GCA CCG GGG ACA AAC ATC ACA AGC ACT TAC TTA
val glu phe ser ala pro gly thr asn ile thr ser thr tyr leu
FIG. 6-3

946 AAC CAG TAT TAT GCA ACG GGA AGC GGA
asn gln tyr tyr ala thr gly ser gly

991 CAC GCC GCT GCC ATG TTT GCC TTG TTA
his ala ala ala met phe ala leu leu



1036 GAG ACA 
glu thr 

1081 GAT CTT 
asp leu 

1126 ATC CAG 
lie gin 

117-1 GAG CAA 
glu gin 
EcnAV 

1216 ATC AAC 
1le asn 

1261 GCC AAA 
a la lys 

1306 AGA AAT 
arg asn 

1351 TAT AAA 
tyr lys 

1396 AAG CTG 
lys leu 

1441 GAC CAA 
asp gin 



AAC GTC CAG CTT CGC 
asn val gin leu arg 
Kpnl 

GGT ACC GCA GGC CGC 
gly thr ala gly arg 

TAT AAA GCA CAG GCA 
tyr lys ala gin ala 

GCG GTG AAA AAA GCG 
ala val lys lys ala 

AAA GCG CGA GAA CTC 
lys ala arg glu leu 




ACT GCC 
thr ala 

GTA 

val ly 

ACA CAG 
thr gin 

CCA AAC 
pro asn 

GTA AAA 
val lys 



CTG CAC AAA 
leu Ms lys 

GAt[~ SCG AAA 
asp[ala lys 

CAA ACC GTT 
gin thr val 

GGA ACA GAC 
gly thr asp 

CGA TAC ATC 
arg tyr 11 e 



GAG GAA 
glu glu 

GAT CAG 
asp gin 

ACA GAT 
thr asp 

GAA CAA 
glu gin 

ATC AGC 
1le ser 

AGA CTG 
arg leu 

GAC AAA 
asp lys 

GAC ACA 
asp thr 

AAA AAG 
lys lys 

GCG TCA 
ala ser 



ACA TCC CAA 
thr ser gin 

AAA CAG CGT 
lys gin arg 

ATG CGG AAA 
met arg lys 

CAA TTT GGC 
gin phe gly 

TCA GCG TAC 
ser ala tyr 

ACA AAA GCA 
thr lys ala 

CAG CTG CCG 
gin leu pro 

GAT AAA GTA 
asp lys val 

GTC GCA AAG 
val ala lys 

GCA CAA ACT 
a/a gin thr 

AAC CTT CAA 
asn leu gin 

AAG CART GCG 
lys gli ala 



GCG ACA CCG 
a/a thr pro 

GAT CCT GCC 
asp pro did 

AAC ATC GTT 
asn He val 

TAC GGC TTA 
tyr gly leu 

GCG G(|a GCA 
a/a aid ala 

CAA ATC~GAT 
gin He asp 

AAC TCC GAC 
asn ser asp 

CAG TCA TAC 
gin ser tyr 

GCA GAA AAA 
a/a glu lys 

GCC ATC AAC 
a/a lie asn 

AAA CGC TTA 
lys arg leu 

AAA GAC AAA
lys asp lys
1486 GTT GCG AAA GCG
val ala lys ala

1531 GCA CAA TCA GCA 
ala gin ser aid 
Pstf 

1576 TCC CTG CAG AAA 
ser leu gin lys 



1621 ACG 
thr 



GCA CAG CAA 
ala gin gin 



FIG. 6-4 

GAA AAA AGC AAA AAG AAA ACA GAT GTG 
glu lys ser lys lys lys thr asp val 

ATT GGC AAG CTG CCT GCA AGT TCA GAA 
11e gly lys leu pro ala ser ser glu 

CGC CTT AAC AAA GTG AAl AGC ACC AAT 
arg leu asn lys val lyj ser thr asn 

TCC GTA TCT GCG GCT GAA AAG AAA TCA 
ser val ser ala ala glu lys lys ser 



GAC AGC 
asp ser 

AAA ACG 
lys thr 

TTG AAG 
leu lys 

ACT GAT 
thr asp 



1666 GCA AAT GCG 'GCA AAA GCA CAA TCA GCC 
ala asn ala ala lys ala gin ser ala 



GTC AAT CAG CTT CAA GCA 
val asn gin leu gin ala 



1711 555 hi S Si fn ] K ? f CGG TTA GAC AAA 6TG AA> 
9iy lys asp lys thr ala leu gin lys arg leu asp lys val lys 

1756 JS hi S3 fll fif 551 GAA G 5 A AAA AAA GTG GAA 4 G « AAG 
iys lys val ala ala ala glu a/a lys lys val glu thr ala lys 

1801 In S3 ^S ^ G 5 G GAA AM GAC AAA ACA AAG MA ™ AAG 
a/a lys val lys lys ala glu lys asp lys thr lys lys ser lys 

1846 J2 E SSI CAG TCT G S A GTG , AAT CAA TTA AAA GCA TCC AAT GAA 
thr ser ala gin ser ala val asn gin leu lys ala ser asn glu 

1891 AAA ACA AAG CTG CAA AAA CGG CTG AAC GCC GTC AA~k CCG AAA 
hs thr lys leu gin lys arg leu asn ala val lyj pro lys 

1936 AAG TAA CCAAAAACCTTTAAPATU£CATTCCAA 6TCTTAAARKTTTTT tt 
lys ♦•• 



1994 CATTCTAAGA
FIG. 9
location of the RP-I gene on the PstI
FIG. 13a
The shaded box represents the region to which the
RP-II "guess-mers" hybridized.
FIG. 10-1

[Nucleotide sequence with deduced amino acid translation; not legible in the extracted text.]

FIG. 10-2
FIG. 10-3



1036 TCA TGG GSA GGG GSC TCT 6GA CTT GAT GAA TGG TAC AGA GAC A1G 
ser trp gly gly gly ser gly ley tsp gh trp tyr irg up met 

1081 GTC AAT GCC TGG C6T TC6 6CC GAT ATT TTC CCT GAG TTT TCA GCG 
vil isn $U xrp irg ser $U up 11e ptu pro glu phe ser */* 

\m GGG AAT ACS GAT CTC TTT ATT CCC GGC GGG CCT 661 TCI AK GCA 
9'ly tsn tnr up ley phe iU pro gly gly pro g)y jer 1h a/a 

1171 AAT CC6 SCA AAC TAI CCA 6AA TCS TTT 6CA ACT CCA GCG ACT GAT 
tsn protli tsn tyr pro glu ftr phe tit thr gly )U thr up 

1216 ATC AAT AAA. AAS CTC GCT SAC TTT TCT CTT CAA GGG CCA TCI CCA 
tit tsn lys lys Jet; tit ttp phe str Itu gin gly pro ser pro 

1261 TAT GAT GAA ATA AAG CCG GAA ATA TCT GCA CCG GGC GTT AAT ATT 
tyr asp glu ile lys pro glu ile ser ala pro gly val asn ile 

1306 CGT TCA TCC GTT CCC GGT CAG ACA TAT GAG GAT GGT TGG GAC GGC 
arg ser ser val pro gly gln thr tyr glu asp gly trp asp gly 

1351 ACA TCA ATG GCA GGG CCG CAT GTA TCC GCT GTT GCT GCA CTG CTG 
thr ser met ala gly pro his val ser ala val ala ala leu leu 

1396 AAA CAG GCG AAT GCC TCA CTT TCT GTT GAT GAG ATG GAG GAT ATA 
lys gln ala asn ala ser leu ser val asp glu met glu asp ile 

1441 TTA ACC AGC ACG GCT GAA CCG CTC ACG GAT TCA ACA TTT CCT GAT 
leu thr ser thr ala glu pro leu thr asp ser thr phe pro asp 

1486 TCA CCG AAT AAC GGA TAT GGC CAT GGT CTG GTG AAT GCT TTT GAT 
ser pro asn asn gly tyr gly his gly leu val asn ala phe asp 

1531 81 51$ TCC 6 ? T 6T I WT {SA TTA 686 A** We SAA GGA CAA 
tit vtl str tit vtl thr tsp gly Itu gly lys iU glu gly gin 

1576 GTT TCT GTA GAG GGG GAT GAC CAA GAG CCT CCT GTC TAT CAG CAT 
val ser val glu gly asp asp gln glu pro pro val tyr gln his 
FIG. 10-4 

1621 GAG AAA GTA ACT GAA GCT TAT GAA GGT GGC AGC CTA CCA CTG ACT 
glu lys val thr leu ala tyr glu gly gly ser leu pro leu thr 

1666 TTC ACA GCT GAA GAC AAT GTG AGT GTG ACA TCT GTA AAG CTG TCC 
leu thr ala glu asp asn val ser val thr ser val lys leu ser 

1711 TAC AAG CTT GAT CAA GGT GAA TGG ACA GAA ATA ACG GCT AAA CGA 
tyr lys leu asp gln gly glu trp thr glu ile thr ala lys arg 

1756 ATC AGC GGT GAT CAT CTA AAA GGA ACG TAT CAG GCA GAG ATC CCA 
ile ser gly asp his leu lys gly thr tyr gln ala glu ile pro 

1801 GAT ATA AAA GGA ACT AAA CTA AGC TAT AAG TGG ATG ATT CAC GAT 
asp ile lys gly thr lys leu ser tyr lys trp met ile his asp 

1846 TTT GGC GGT CAT GTC GTT TCC TCT GAC GTA TAC GAT GTA ACA GTG 
phe gly gly his val val ser ser asp val tyr asp val thr val 

1891 AAA CCA AGC ATC ACG GCG GGA TAT AAG CAG GAC TTT GAA ACT GCA 
lys pro ser ile thr ala gly tyr lys gln asp phe glu thr ala 

1936 CCC GGC GGC TGG GTT GCG AGC GGA ACA AAT AAT AAC TGG GAA TGG 
pro gly gly trp val ala ser gly thr asn asn asn trp glu trp 

1981 Sft 511 25 Hi tf T 8 f C CCA MT 4S A 6 S A SCA T « WA GAA AAA 

gly val pro ser thr gly pro asn thr ala ala ser gly glu lys 
2026 SIf !£ 6 ? A it 6 MT F 6 A V ATT AT6 CCA CAG CAA ACA 

tyr gly thr asn leu thr glu ile met pro thr gln gln thr 

2071 TGA ACCTTGTTATGCCTCCTATTAAAGCACCTGATTCAGGAAGTCTGTTCCTTCAATT 
OPA 

TAAAAGCTGGCACAATTTAGAGGATGATTTTGATTACGGCTACGTTTTTGTTCTTCCGGA 